Collecting the best books mentioned on hacker news, reddit and other places
Both have had a huge impact on how I work with code and design it. Trying to explain these concepts is hard without context. Sometimes I just copy/paste the sections I think they could benefit from.
I would try and clean up the bits I was working on.
This is a good book on the topic of refactoring a large code base with no tests.
You're definitely right that unit tests are a part of the solution.
can be read in a few different registers (making a case for what unit tests should be in a greenfield system, why and how to backfit unit tests into a legacy system) but it makes that case pretty strongly. It can seem overwhelming to get unit tests into a legacy system but the reward is large.
I remember working on a system that was absolutely awful but was salvageable because it had unit tests!
Also, getting control of the build procedure is generally key to the scheduling issue -- I have seen many new projects where a team of people work on something and think all of the parts are good to go, but then you find there is another six months of integration work, installer engineering, and other things you need to do to ship a product. Automation, documentation, and simplification are all bits of the puzzle, but if you want agility, you need to know how to go from source code to a product, and not every team does.
If you have to write mocks in the native language, mocks will probably drive you insane.
Tools like Mockito can make a big difference.
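To give a sense of what a mocking library saves you, here is a minimal sketch using Python's standard `unittest.mock`, which plays a role roughly similar to Mockito (it is not Mockito itself). The `charge_customer` function and the gateway's `charge` method are hypothetical names made up for the illustration.

```python
from unittest.mock import Mock


def charge_customer(gateway, customer_id, amount_cents):
    """Hypothetical code under test: charges a customer through a payment gateway."""
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    receipt = gateway.charge(customer_id, amount_cents)
    return receipt["status"] == "ok"


# The mock stands in for the real gateway; no hand-written fake class is needed.
gateway = Mock()
gateway.charge.return_value = {"status": "ok"}

result = charge_customer(gateway, "cust-42", 1999)

# Verify the interaction, much as Mockito's verify() would.
gateway.charge.assert_called_once_with("cust-42", 1999)
```

Writing the equivalent fake by hand means a new class per collaborator, plus bookkeeping for every call you want to check, which is where the pain tends to come from.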
I worked on a project which was terribly conceived, specified, and implemented. My boss said that they shouldn't even have started it and shouldn't have hired the guy who wrote it! Because it had tests, however, it was salvageable, and I was able to get it into production.
makes the case that unit tests should always run quickly, not depend on external dependencies, etc.
I do think a fast test suite is important, but there are some kinds of slower tests that can have a transformative impact on development:
* I wrote a "super hammer" test that smokes out a concurrent system for race conditions. It took a minute to run, but after that, I always knew that a critical part of the system did not have races (or if it did, they were hard to find).
* I wrote a test suite for a lightweight ORM system in PHP that would do real database queries. When the app was broken by an upgrade to MySQL, I had it working again in 20 minutes. When I wanted to use the same framework with MS SQL Server, it took about as long to port it.
* For deployment it helps to have an automated "smoke test" that will make sure that the most common failure modes didn't happen.
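As a rough idea of what a "hammer" test of the first kind can look like, here is a small Python sketch. The `Counter` class is a hypothetical stand-in for the critical part of the system, and the thread and iteration counts are arbitrary; a real hammer test would run far longer and against real code.

```python
import threading


class Counter:
    """Hypothetical shared component; the lock is what the test is guarding."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:
            self.value += 1


def hammer(counter, n_threads=8, n_iterations=20_000):
    """Run many threads against the counter at once and return the final value."""
    barrier = threading.Barrier(n_threads)  # release all threads together

    def worker():
        barrier.wait()
        for _ in range(n_iterations):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


final = hammer(Counter())
```

If the lock were removed, lost updates could make `final` come out below 160,000. A passing run does not prove there are no races; it only shows that this much hammering did not surface one.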
That said, TDD is most successful when you are in control of the system. When writing GUI code, the main uncertainty I've seen is often mistrust of the underlying platform (today that could be "Does it work in Safari?").
When it comes to servers and stuff, there is the issue of "can you make a test reproducible". For instance, you might be able to make a "database" or "schema" inside a database with a random name and do all your stuff there. Or maybe you can spin one up in the cloud, or use Docker or something like that. It doesn't matter exactly how you do it, but you don't want to be the guy who nukes the production database (or another developer's or tester's database) because the build process has integration tests that use the same connection info as them.
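One way to picture the randomly named scratch database is the Python sketch below. It uses SQLite from the standard library purely as a stand-in; with MySQL or PostgreSQL you would create and drop a randomly named database or schema instead. The helper name `throwaway_database` and the `users` table are invented for the example.

```python
import os
import sqlite3
import tempfile
import uuid
from contextlib import contextmanager


@contextmanager
def throwaway_database():
    """Yield a connection to a uniquely named scratch database, then delete it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, f"test_{uuid.uuid4().hex}.db")
        conn = sqlite3.connect(path)
        try:
            yield conn, path
        finally:
            conn.close()


with throwaway_database() as (conn, db_path):
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    conn.commit()
    row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

database_removed = not os.path.exists(db_path)
```

The important property is that the test never sees shared connection info: it builds its own target, and that target disappears when the test ends.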
I've gathered all the book titles in this thread and created Amazon affiliate links (if you don't mind. Otherwise you still have all the titles together :-) )
A Pattern Language, Alexander and Ishikawa and Silverstein http://amzn.to/2s9aSSc
Advanced Programming in the Unix Environment , Stevens http://amzn.to/2qPOMjN
Algorithmics: the Spirit of Computing, Harel http://amzn.to/2rW5FNS
Applied Cryptography, Wiley http://amzn.to/2rsULxS
Clean Code, Martin http://amzn.to/2sIOWtQ
Clean Coder, Martin http://amzn.to/2rWgbEP
Code Complete, McConnell http://amzn.to/2qSUIwE
Code: The Hidden Language of Computer Hardware and Software, Petzold http://amzn.to/2rWfR9d
Coders at Work, Seibel http://amzn.to/2qPCasZ
Compilers: Principles, Techniques, & Tools, Aho http://amzn.to/2rCSUVA
Computer Systems: A Programmer's Perspective, O'Hallaron and Bryant http://amzn.to/2qPY5jH
Data Flow Analysis: Theory and Practice, Khedker http://amzn.to/2qTnSvr
Dependency Injection in .NET, Seemann http://amzn.to/2rCz0tV
Domain Driven Design, Evans http://amzn.to/2sIGM4N
Fundamentals of Wireless Communication, Tse and Viswanath http://amzn.to/2rCTmTM
Genetic Programming: An Introduction, Banzhaf http://amzn.to/2s9sdut
Head First Design Patterns, O'Reilly http://amzn.to/2rCISUB
Implementing Domain-Driven Design, Vernon http://amzn.to/2qQ2G5u
Introduction to Algorithms, CLRS http://amzn.to/2qXmSBU
Introduction to General Systems Thinking, Weinberg http://amzn.to/2qTuGJw
Joy of Clojure, Fogus and Houser http://amzn.to/2qPL4qr
Let over Lambda, Hoyte http://amzn.to/2rWljcp
Operating Systems: Design and Implementation, Tanenbaum http://amzn.to/2rKudsw
Parsing Techniques, Grune and Jacobs http://amzn.to/2rKNXfn
Peopleware: Productive Projects and Teams, DeMarco and Lister http://amzn.to/2qTu86F
Programming Pearls, Bentley http://amzn.to/2sIRPe9
Software Process Design: Out of the Tar Pit, McGraw-Hill http://amzn.to/2rVX0v0
Software Runaways, Glass http://amzn.to/2qT2mHn
Sorting and Searching, Knuth http://amzn.to/2qQ4NWQ
Structure and Interpretation of Computer Programs, Abelson and Sussman http://amzn.to/2qTflsk
The Art of Unit Testing, Manning http://amzn.to/2rsERDu
The Art of Unix Programming, ESR http://amzn.to/2sIAXUZ
The Design of Design: Essays from a Computer Scientist, Brooks http://amzn.to/2rsPjev
The Effective Engineer, Lau http://amzn.to/2s9fY0X
The Elements of Style, Strunk and White http://amzn.to/2svB3Qz
The Healthy Programmer, Kutner http://amzn.to/2qQ2MtQ
The Linux Programming Interface, Kerrisk http://amzn.to/2rsF8Xi
The Mythical Man-Month, Brooks http://amzn.to/2rt0dAR
The Practice of Programming, Kernighan and Pike http://amzn.to/2qTje0C
The Pragmatic Programmer, Hunt and Thomas http://amzn.to/2s9dlvS
The Psychology of Computer Programming, Weinberg http://amzn.to/2rsPypy
Transaction Processing: Concepts and Techniques, Gray and Reuter http://amzn.to/
Types and Programming Languages, Pierce http://amzn.to/2qT2d6G
Understanding MySQL Internals, Pachev http://amzn.to/2svXuFo
Working Effectively with Legacy Code, Feathers http://amzn.to/2sIr09R
Zen of graphics programming, Abrash http://amzn.to/2rKIW6Q
This is a good high-level overview of the process. I highly recommend that engineers working in the weeds read "Working Effectively with Legacy Code", as it has a ton of patterns in it that you can implement, and more detailed strategies on how to do some of the code changes hinted at in this article.
So as to be constructive, I'm going to reference a classic: Working Effectively with Legacy Code. Here's a nice clip from an SO answer paraphrasing it:
"To me, the most important concept brought in by Feathers is seams. A seam is a place in the code where you can change the behaviour of your program without modifying the code itself. Building seams into your code enables separating the piece of code under test, but it also enables you to sense the behaviour of the code under test even when it is difficult or impossible to do directly (e.g. because the call makes changes in another object or subsystem, whose state is not possible to query directly from within the test method).
This knowledge allows you to notice the seeds of testability in the nastiest heap of code, and find the minimal, least disruptive, safest changes to get there. In other words, to avoid making "obvious" refactorings which have a risk of breaking the code without you noticing - because you don't yet have the unit tests to detect that."
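For a concrete picture of a seam used for both separation and sensing, here is a small Python sketch. `InvoiceProcessor` and its `send_mail` parameter are hypothetical names, not something taken from the book; the point is only that the test changes what happens at the mail call without editing `process`.

```python
class InvoiceProcessor:
    """Hypothetical legacy class that used to send mail directly."""

    def __init__(self, send_mail=None):
        # The seam: production code relies on the default, tests pass their own.
        self._send_mail = send_mail or self._send_real_mail

    def _send_real_mail(self, to, body):
        raise RuntimeError("would talk to a real mail server")

    def process(self, customer_email, total):
        if total > 0:
            self._send_mail(customer_email, f"You owe {total}")
        return total > 0


sent = []  # sensing: record what the code under test tried to do
processor = InvoiceProcessor(send_mail=lambda to, body: sent.append((to, body)))
billed = processor.process("a@example.com", 30)
```

The test never touches a mail server (separation), and the `sent` list lets it observe an effect that would otherwise be invisible from inside the test (sensing).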
As you get more experience under your belt, you'll begin to see these situations again and again of code becoming large, difficult to reason about or test, and similarly having low direct business benefit for refactoring. But crucially, learning how to refactor as you go is a huge part of working effectively with legacy code and by virtue of that, maturing into a senior engineer -- to strain a leaky analogy, you don't accrue tech debt all at once, so why would it make sense to pay it off all at once? The only reason that would occur is if you didn't have a strong culture of periodically paying off tech debt as you went along.
I'm not going to insinuate that it was necessarily wrong that you decided to solve the problem as you did, and the desire to be proactive about it is certainly not something to be criticized. But it wasn't necessarily right, either. Your leadership should have prevented something like this from occurring, because in all likelihood, you wasted those extra hours and naively thought that extra hours equal extra productivity. They don't. You ought to aim for maximal results for minimal hours of work, so that you can spend as much time as you can delivering results. And, unless you're getting paid by the hour instead of salaried, you're actually getting less pay. So to recap: you're getting less pay, you're giving the company subpar results (by definition, because you're using more hours to achieve what a competent engineer could do with only 40 hour workweeks so you're 44% as efficient), and everyone's losing a little bit. Thankfully, you still managed to get the job done, and because you were able to gain authorship and ownership over the new part of the codebase, you were able to politically argue for better compensation. Good for you, you should always bargain for what you deserve. But, just because you got a more positive outcome doesn't mean you went about it the most efficient way.
The best engineers (and I would argue workers in general) are efficient. They approach every engineering problem they can with solutions so simple and effective that they seem boring, only reaching for the impressive stuff when it's really needed, and with chagrin. If you can combine that with self-advocacy, you'll really be cooking with gas as far as your career is concerned. And it'll get you a lot further than this silly childish delusion that more hours equals more results, or more pay. Solid work, solid negotiation skills, solid marketing skills and solid communication skills earn you better pay. The rest is fluff.
Check out the book "Working Effectively with Legacy Code", by Michael Feathers.
I believe the basic approach is to write tests to capture the current behaviour at the system boundaries - for a web application, this might take the form of automated end-to-end tests (Selenium WebDriver) - then, progressively refactor and unit test components and code paths. By the end of the process, you'll end up with a comprehensive regression suite, giving developers the confidence to make changes with impunity - whether that's refactoring to eliminate more technical debt and speed up development, or adding features to fulfill business needs.
This way, you can take a gradual, iterative approach to cleaning up the system, which should boost morale (a little bit of progress made every iteration), and minimises risk (you're not replacing an entire system at once).
I've used this approach to rewrite a Node.js API that was tightly coupled to MongoDB, and migrated it to PostgreSQL.
This article is great.
Similarly, Working Effectively with Legacy Code by Michael Feathers (http://amzn.to/1UxwVdL) is a great programming book. I appreciate it because it's really nothing but patterns for dealing with bad code (mostly Java, but most of it translates to other languages). Very little why (which I already know), lots of "how to fix X", aka great signal-to-noise ratio.
I like this book, it has a lot of tips for situations like these:
I am not sure the above idea is mentioned by Michael Feathers in his amazing book "Working Effectively with Legacy Code", but it is a great idea, and combined with the things that Michael does cover it will do you a lot of good!
> > My own preference for the answer is Uncle Bob's description, which is this: technical debt is any production code that does not have (good) tests.
> That's certainly an example of technical debt.
Agreed, it is not the only example, but perhaps it is a good one, as that is a particularly important form of debt that makes the code harder to safely change. I.e. it is a form of technical debt that makes it more expensive to pay off other kinds of technical debt.
Curiously, Michael Feathers has a similar definition of legacy code:
> To me, legacy code is simply code without tests.
Dealing effectively with legacy code:
Working Effectively With Legacy Code by Michael Feathers http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
Debugging with GDB: The GNU Source-Level Debugger by Stallman, Pesch, and Shebs http://www.amazon.com/Debugging-GDB-GNU-Source-Level-Debugge...
The Art of Debugging with GDB, DDD, and Eclipse by Matloff & Salzman http://www.amazon.com/gp/product/1593271743
Also, read this book: http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
It helps a lot and teaches you how to use grep and other tools (a lot more others that I no longer remember) to search and find your way through legacy code.
See if you can talk to the people in your company who hired the contractor. They might at least be able to give you a high-level description of what the software is supposed to do and how it's supposed to work. They might even have specs that they prepared for the contractor or other design documentation.
If the contractor's software has no tests or is poorly written, it's going to be hard to add features to it or refactor it. You might want to read Working Effectively with Legacy Code by Michael Feathers, which describes how you can get a handle on large bodies of legacy software.
This reminds me of a post by Michael Feathers, titled "The carrying cost of code". Feathers wrote the book about legacy code. I think he makes approximately the same point:
> If you are making cars or widgets, you make them one by one. They proceed through the manufacturing process and you can gain very real efficiencies by paying attention to how the pieces go through the process. Lean Software Development has chosen to see tasks as pieces. We carry them through a process and end up with completed products on the other side.
> It's a nice view of the world, but it is a bit of a lie. In software development, we are essentially working on the same car or widget continuously, often for years. We are in the same soup, the same codebase. We can't expect a model based on independence of pieces in manufacturing to be accurate when we are working continuously on a single thing (a codebase) that shows wear over time and needs constant attention.
This is the standard recommended book:
I write new tests in any area I'm going to be working in.
One book that might give you some useful advice is "Working Effectively with Legacy Code" by Michael Feathers:
Great article that is still relevant today (sadly, I remember reading it when it first came out). If you really, really feel the need to rewrite from scratch, I recommend instead picking up a copy of "Working Effectively with Legacy Code" by Michael Feathers. It will give you ways to improve those terrible code bases while not throwing out the existing code. Plus you'll still get that "new car smell" of working on your code.
If you do decide to add tests to an existing code base, I found "Working Effectively with Legacy Code" to be a good guide. Check out the table of contents.
"Working Effectively with Legacy Code" is by Michael Feathers. http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
Appropriately, it's a book: http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea...
Some automatic tools could help (although I doubt they'll work on DBase III): static analysis to see what's there, and version control so you can start at the top, log your way through, and roll back to a previous working version.
But it's at the very least weeks of pain.
As the link by lttlrck also advocates: throwing shit out can easily be a mistake. More usually, http://www.amazon.com/Working-Effectively-Legacy-Michael-Fea... + http://www.amazon.com/Refactoring-Improving-Design-Existing-... can get you further, faster. Stuff keeps working while you incrementally improve it.
Working with legacy systems is a black art that I didn't learn about until I took a job supporting and extending one such system. The book I link to above was critical in helping me to understand the approach taken by the team I was working with. It takes a keen, detail-focused mind to do this kind of work.
The approach we took was to create a legacy interface layer. We did this by first wrapping the legacy code within a FFI. We built a test-suite that exercised the legacy application through this interface. Then we built an API on top of the interface and built integration tests that checked all the code paths into the legacy system. Once we had that we were able to build new features on to the system and replace each code path one by one.
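A heavily simplified Python sketch of that shape is below. Everything in it is hypothetical: `legacy_price` stands in for the legacy code reached through the FFI, `new_price` for a replaced code path, and `PricingApi` for the API built on top of the interface layer.

```python
def legacy_price(quantity, unit_cents):
    """Stand-in for the legacy system (in the story above this sat behind an FFI)."""
    total = quantity * unit_cents
    if quantity >= 10:
        total = total - total // 10  # bulk discount buried in old code
    return total


def new_price(quantity, unit_cents):
    """Hypothetical rewritten code path, meant to match the legacy behaviour."""
    total = quantity * unit_cents
    discount = total // 10 if quantity >= 10 else 0
    return total - discount


class PricingApi:
    """Interface layer: callers only see this, so paths can be swapped one by one."""

    def __init__(self, use_new_path=False):
        self._impl = new_price if use_new_path else legacy_price

    def price(self, quantity, unit_cents):
        return self._impl(quantity, unit_cents)


# The same cases run against both implementations through the same interface.
cases = [(1, 250), (9, 199), (10, 199), (25, 1000)]
old_results = [PricingApi(use_new_path=False).price(q, u) for q, u in cases]
new_results = [PricingApi(use_new_path=True).price(q, u) for q, u in cases]
```

Because the tests only talk to the interface, the same suite that pinned down the legacy behaviour can be rerun each time a code path is switched over.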
Unsurprisingly we actually discovered bugs in the old system this way and were able to correct them. It didn't take long for the stakeholders to stop worrying and trust the team. However there was a lot of debate and argument along the way.
The problem isn't technical. You can simultaneously maintain and extend legacy applications and avoid all of the risks stakeholders are worried about. One could actually improve these systems by doing so. The real problem is political and convincing these stakeholders that you can minimize the risk is a difficult task. It was the hardest part of working on that team -- even when we were demonstrating our results!
The hardest part about working with legacy systems is the huge bureaucracies that sit on top of them.
Essentially, my understanding of best practice is to write high-level functional tests for the features that appear to work and then use them to ensure there are no regressions as a result of your changes. Some people even define legacy code as "code without tests".
This is what Michael Feathers calls 'seams' in his book, Working Effectively with Legacy Code. Often, you have to do exploratory testing: you don't really know the requirements, but you write tests that the current code passes. Then you can refactor it. That way, you can check that the current behavior hasn't changed.
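Feathers' book calls tests of this kind characterization tests. A minimal Python sketch of the idea follows, assuming a made-up function `legacy_format_name` whose requirements nobody remembers; in practice the recorded outputs would usually be saved to a file rather than kept in memory.

```python
def legacy_format_name(first, last):
    """Hypothetical legacy function whose exact requirements nobody remembers."""
    if not first:
        return last.upper()
    return last.upper() + ", " + first[0] + "."


def record(func, inputs):
    """Capture whatever the code does today, right or wrong."""
    return {args: func(*args) for args in inputs}


inputs = [("Ada", "Lovelace"), ("", "Turing"), ("grace", "hopper")]
golden = record(legacy_format_name, inputs)  # normally saved to a file once


def refactored_format_name(first, last):
    """Refactored version; it must reproduce the recorded behaviour exactly."""
    initial = f", {first[0]}." if first else ""
    return f"{last.upper()}{initial}"


mismatches = [args for args in inputs if refactored_format_name(*args) != golden[args]]
```

Note that the recorded output for a lowercase first name keeps the lowercase initial. Whether that is a bug is a separate question; the test only pins down what the code does now.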
Very good read, if you need to deal with legacy code and you don't know where to start.
This is actually the type of system (especially if it's very rough code-quality-wise in many places) for which I think regression tests are very useful (tests to make sure the system doesn't change function).
A book called "Working effectively with legacy code" by Feathers is great for instrumenting and regression testing old code bases then changing them without breaking them.
I have the book "Working Effectively with Legacy Code", and it's pretty much just "Put things under test, then change them." Still a useful read, though, if you find yourself working in that sort of thing often (I do).