Testing is better than data structures and algorithms

danielmarkbruce

This will annoy a lot of folks, but:

1 - If you work on large-scale software systems, especially infrastructure software of most types, then you need to know and understand DSA and feel it in your bones.

2 - Most people work on crud apps or similar and don't really need to know this stuff. Many people in this camp don't realize that people working on 1 really do need to know this stuff.

What someone says on this topic says more about what things they have worked on in their life than anything else.

mlinhares

That's going to be true in all fields, people think their experiences are the only valid experiences and everyone else must think and work on what they think is important, otherwise they're wrong.

ecshafer

> What someone says on this topic says more about what things they have worked on in their life than anything else.

This is the crux of the debate. If you work on CRUD apps, you basically need to know hash maps and lists, but getting better at SQL and writing clean code is good. But there are many areas where writing the right code vs the wrong code really matters. I was writing something the other day where one small in-loop operation was the difference between a method running in milliseconds and minutes. Or choosing the right data structure can simplify a feature into 1/10th the code and make it run 100x better than the wrong one.
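A minimal sketch of the kind of in-loop operation that can make this difference, assuming the hot path is a membership check. The function names and data are hypothetical, not the code from the comment above, and the fast version assumes the items are hashable.

```python
def common_ids_slow(wanted, records):
    """Keep records that appear in `wanted`, where `wanted` is a list.

    `r in wanted` scans the list, so each check is O(len(wanted)) and the
    whole loop is O(len(wanted) * len(records)).
    """
    return [r for r in records if r in wanted]


def common_ids_fast(wanted, records):
    """Same result, but with a set built once before the loop.

    Building the set is O(len(wanted)); each lookup is O(1) on average,
    so the loop is roughly O(len(wanted) + len(records)).
    """
    wanted_set = set(wanted)
    return [r for r in records if r in wanted_set]
```

Both functions return the same list in the same order; only the cost of the check inside the loop changes. With a few thousand entries on each side the gap should already be visible under `timeit`, and it widens as the inputs grow.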

MoreQARespect

This happens to me too; it just happens roughly 100x less often than me needing to know how to test properly.

It's never "the other day"; it's 10x a day, every day.

So, OP is still correct.

evrydayhustling

This is so true. When you get DSA wrong, you end up needing insanely complex system designs to compensate -- and being great at Testing just can't keep up with the curse of dimensionality from having more moving parts.

uncivilized

I already know the answer to this, but did you read the article? Ned addresses your concerns.

danielmarkbruce

No, he doesn't. He doesn't discuss the gigantic dividing line between the two different types of systems I categorize above. He also doesn't cover the "feel it in your bones" required in the type 1 systems. Spend a minute reading or listening to Jeff Dean talk, and you'll see what is required to build those types of systems. Spend some time somewhere working on those systems and you'll come across some folks who just have this ready to go and can apply it at the drop of a hat.

matheusmoreira

> Of course some engineers need to implement hash tables, or sorting algorithms or whatever.

> We love those engineers: they write libraries we can use off the shelf so we don’t have to implement them ourselves.

The world needs to love developers like us more. To me it seems only the killer app writing crowd is valued.

jerf

This is one of the things I'd tune in the current curriculum.

When I went to college in the late 1990s, we were right on the verge of a major transition from DSAs being something every programmer would implement themselves to something that you just pick up out of your libraries. So it makes sense that we would have some pretty heavy-duty labs on implementing very basic data structures.

That said, I escaped into the dynamic programming world for the next 15 years or so, so I almost never actually did anything of significance with this. And now even in the static world, I almost never do anything with this stuff directly because it's all libraries for them now too. Even a lot of modern data structures work is just using associative maps and arrays together properly.

So I would agree that we could A: spend somewhat less time on this in the curriculum and B: tune it to be more about how to use arrays and maps and less about how to bit bang efficient hash tables.

People always get frosty about trying to remove or even "tune down" the amount of time spent in a curriculum, but consider the number of things you want to add and consider that curricula are essentially zero-sum games; you can't add to them without removing something. If we phrase this in terms of "what else could we be teaching other than a fifth week on pointer-based data structures" I imagine it'll sound less horrifying to tweak this.

Not that it'll be tweaked, of course. But it'd be nice to imagine that I could live in a world where we could have reasonable discussions about what should be in them.

fastaguy88

It really depends. Working on genome analysis, I once encountered/interrupted (by rebooting after a software update) a student who had been running an analysis for more than a week, because they had not pre-sorted the data. With pre-sorted data, it took a few minutes.

Not everyone works on web sites using well-optimized libraries; some people need to know about N and Nlog(N) vs N^2.
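To illustrate how pre-sorting can turn an N^2 job into an N log(N) one, here is a small sketch. The task (finding the smallest gap between any two positions) is a hypothetical stand-in, not the analysis the student was running.

```python
from itertools import combinations


def min_gap_quadratic(positions):
    """Compare every pair of positions: O(N^2) comparisons."""
    return min(abs(a - b) for a, b in combinations(positions, 2))


def min_gap_sorted(positions):
    """Sort once, then compare only neighbours: O(N log N) overall.

    After sorting, the closest pair must be adjacent, so a single
    linear pass over neighbouring elements is enough.
    """
    ordered = sorted(positions)
    return min(b - a for a, b in zip(ordered, ordered[1:]))
```

Both functions need at least two positions and return the same answer. The quadratic version does about N*(N-1)/2 comparisons, which is where the "more than a week" class of runtime comes from once N reaches the millions.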

ChrisMarshallNY

I agree with the article, but I'll bet a lot of others don't. Discussions on Code Quality don't fare well here. Wouldn't surprise me if the article already has flags.

Of course, "testing" is in the eye of the beholder.

Some folks are completely into TDD and insist that you need to have 100% code coverage tests before writing one line of application code, and some folks think that 100% code coverage unit tests mean that the system is fully tested.

I've learned that it's a bit more nuanced than this[0].

[0] https://littlegreenviper.com/testing-harness-vs-unit/

general1465

Testing, especially vstest.console.exe in Visual Studio, has carried my business really far. I have accumulated thousands of tests on my codebase, usually based on customer requirements or on past bugs which I have been trying to replicate.

I think that a lot of people dislike testing because a lot of tests can run for hours. In my case it is almost 6 hours from start to finish. However, as a software developer I have accumulated a lot of computers which are kind of good and which I don't want to throw out yet, but which are not really usable for current development, i.e. 8GB of RAM, 256GB SSD, i5 CPU from 2014. It would be a punishment to use one with Visual Studio today. But it is a perfect machine for compiling in the console, i.e. dotnet build or msbuild, and running tests via vstest glued together with a PowerShell script. So this dedicated testing machine runs on changes overnight, and I will see whether it passed or not and, if not, fix the tests which did not pass.

This setup may feel clunky, but it allows me to make sweeping changes in a codebase and be confident enough that if the tests pass, it will very likely work for the customer too. The most obvious example where tests carried me has been moving to .NET 8 from .NET Framework 4.8. I went from a 90% failure rate on tests to all tests clear in like 3-4 iterations.

hvb2

This feels backwards. When you have a good understanding of data structures you have the luxury of testing.

If you focus on testing over data structures, you might end up testing something that you didn't need to test because you used the wrong data structures.

IMHO too often people don't consider big O because it works fine with their 10 row test case... And then it grinds to a halt when given a real problem.

jancsika

> IMHO too often people don't consider big O because it works fine with their 10 row test case... And then it grinds to a halt when given a real problem.

Not if the user can, say, farm 1,000,000 different rows 100 times over an hour and a half while gossiping with their office mates. I offer Excel as Exhibit A.

cogman10

That wasn't the thrust of the article.

The article is saying that it's more important to write tests than it is to learn how to write data structures. It specifically says you should learn which data structures you should use, but don't focus on knowing how to implement all of them.

It calls out, specifically, that you should know that `sort` exists but you really don't need to know how to implement quicksort vs selection sort.

hvb2

No, it says learn data structures first, then focus on testing.

You don't have to go super deep on all the sort algorithms, sure. That's like saying that learning testing implies writing a mocking library

glitchc

The article fails to demonstrate how code-tests result in objectively better code. Many comp sci programs have courses on testing that cover TDD, unit testing and fuzzing, among other topics.

Yet much of the safety critical code we rely on for critical infrastructure (nuclear reactors, aircraft, drones, etc) is not tested in-situ. It is tested via simulation, but there's minimal testing in the operating environment which can be quite complex. Instead the code follows carefully chosen design patterns, data structures and algorithms, to ensure that the code is hazard-free, fault-tolerant and capable of graceful degradation.

So, testing has its place, but testing is really no better than simulation. And in simulation, the outputs are only as good as the inputs. It cannot guarantee code safety and is not a substitute for good software design (read: structures and algorithms).

Having said that, fuzzing is a great way to find bugs in your code, and highly recommended for any software that exposes an API to other systems.
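As a rough sketch of what fuzzing an API entry point looks like, here is a naive random fuzzer using only the standard library. `parse_kv` is a hypothetical function under test, and `fuzz` is an illustrative helper, not a real fuzzing tool; real fuzzers such as AFL or libFuzzer are coverage-guided and far more effective.

```python
import random
import string


def parse_kv(text):
    """Hypothetical function under test: parse 'a=1;b=2' into a dict.

    Documented behaviour: raise ValueError on malformed input.
    """
    result = {}
    if not text:
        return result
    for pair in text.split(";"):
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed pair: {pair!r}")
        result[key] = value
    return result


def fuzz(target, runs=10_000, seed=0):
    """Feed random strings to `target`.

    Returns a list of (input, exception) pairs for every exception that
    is NOT the documented ValueError, i.e. every likely bug.
    """
    rng = random.Random(seed)  # seeded so failures are reproducible
    alphabet = string.ascii_letters + string.digits + "=;"
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 20)
        text = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            target(text)
        except ValueError:
            pass  # documented rejection of bad input, not a bug
        except Exception as exc:  # anything else is unexpected
            failures.append((text, exc))
    return failures
```

Calling `fuzz(parse_kv)` returns the inputs that crashed the parser in an undocumented way. An empty list only means this particular random search found nothing, which is the "outputs are only as good as the inputs" caveat from above.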

azeirah

I don't understand what the difference between a simulation and a test is?

MoreQARespect

>fails to demonstrate how code-tests result in objectively better code.

Tests give the freedom to refactor which results in better code.

>So, testing has its place, but testing is really no better than simulation

Testing IS simulation and simulation IS testing.

>And in simulation, the outputs are only as good as the inputs. It cannot guarantee code safety

Only juniors think that you can get guarantees of code safety. Seniors look for ways to de-risk code, knowing that you're always trending towards a minimum.

One of the key skills in testing is defining good, realistic inputs.

cogman10

I agree.

The main benefit of being familiar with how data structures and algorithms work is that you become familiar with their runtime characteristics and thus can know when to reach for them in a real problem.

The author is correct here. You'll almost never need to implement a B-Tree. What's important is knowing that B-Trees have log n insertion times with good memory locality making them faster than simple binary trees. Knowing how the B-Tree works could help you in tuning it correctly, but otherwise just knowing the insertion/lookup efficiencies is enough.

atmavatar

The title is unfortunately more than a little irresponsible, considering it's the norm for many (most?) to read only the title.

There is no dichotomy here: you need to know testing as well as data structures and algorithms.

However, the thrust of the article itself I largely agree with -- that it's less important to have such in-depth knowledge about data structures and algorithms that you can implement them from scratch and from memory. Nearly any modern language you'll program in includes a standard library robust enough that you'll almost never have to implement many of the most well-known data structures and algorithms yourself. The caveat: you still need to know enough about how they work to be capable of selecting which to use.

In the off-chance you do have to implement something yourself, there's no shortage of reference material available.

wjrb

Are there any resources out there that anyone can recommend for learning testing in the way the author describes?

In-the-trenches experience (especially "good" or "doing it right" experience) can be hard to come by; and why not stand on the shoulders of giants when learning it the first time?

Jtsummers

Working Effectively with Legacy Code by Michael Feathers. It spends a lot of time on how to introduce testability into existing software systems that were not designed for testing.

Property-Based Testing with PropEr, Erlang, and Elixir by Fred Hebert. While a book about a particular tool (PropEr) and pair of languages (Erlang and Elixir), it's a solid introduction on property-based testing. The techniques described transfer well to other PBT systems and other languages.

Test-Driven Development by Kent Beck.

https://www.fuzzingbook.org/ by Zeller et al. and https://www.debuggingbook.org/ by Andreas Zeller. The latter is technically about debugging, but it has some specific techniques that you can incorporate into how you test software. Like Delta Debugging, also described in a paper by Zeller et al. https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=988....

I'm not sure of other books I can recommend, the rest I know is from learning on the job or studying specific tooling and techniques.
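For a taste of the property-based testing idea mentioned above, here is a hand-rolled sketch in Python using only the standard library. It does not use PropEr or any other PBT framework (those add input shrinking and much better generators), and the run-length encoder is a hypothetical function under test.

```python
import random
from itertools import groupby


def rle_encode(s):
    """Run-length encode a string: 'aaab' -> [('a', 3), ('b', 1)]."""
    return [(ch, len(list(group))) for ch, group in groupby(s)]


def rle_decode(pairs):
    """Inverse of rle_encode."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(runs=1000, seed=0):
    """Check properties over many random inputs instead of fixed examples.

    Property 1: decode(encode(s)) == s for every string s.
    Property 2: adjacent runs in the encoding never share a character.
    Returns the number of inputs checked.
    """
    rng = random.Random(seed)  # seeded so a failure can be replayed
    for _ in range(runs):
        # A tiny alphabet makes long runs (the interesting case) likely.
        s = "".join(rng.choice("ab") for _ in range(rng.randint(0, 30)))
        encoded = rle_encode(s)
        assert rle_decode(encoded) == s, f"round trip failed for {s!r}"
        assert all(a[0] != b[0] for a, b in zip(encoded, encoded[1:]))
    return runs
```

The point is that the test states what must always hold rather than listing input/output pairs by hand, so it can surface cases nobody thought to write down.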

cogman10

Resources, none that I'm aware of. I generally think this is an OK way to look at testing [1], though I think it goes too far if you completely adopt their framework.

To boil down the tests I like to see: structure them with "Given/when/then" statements. You don't need a framework for this, just make method calls with whatever unit test framework you are using. Keep the methods small, don't do a whole lot of "then"s, split that into multiple tests. Structure your code so that you aren't testing too deep. Ideally, you don't need to stand up your entire environment to run a test. But do write some of those tests, they are important for catching issues that can hide between unit tests.

[1] https://cucumber.io/docs/bdd/
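A minimal sketch of the "Given/when/then" structure described above, using plain method calls in Python's `unittest` rather than a BDD framework. The `Cart` class and the helper name `given_cart_with` are hypothetical, chosen only for illustration.

```python
import unittest


class Cart:
    """Hypothetical class under test."""

    def __init__(self):
        self.items = {}  # name -> (price, quantity)

    def add(self, name, price, qty=1):
        if qty <= 0:
            raise ValueError("qty must be positive")
        _, old_qty = self.items.get(name, (price, 0))
        self.items[name] = (price, old_qty + qty)

    def total(self):
        return sum(price * qty for price, qty in self.items.values())


class CartTest(unittest.TestCase):
    # "Given" steps are ordinary helper methods, no framework needed.
    def given_cart_with(self, *lines):
        cart = Cart()
        for name, price, qty in lines:
            cart.add(name, price, qty)
        return cart

    def test_total_sums_price_times_quantity(self):
        cart = self.given_cart_with(("tea", 3, 2))  # given
        cart.add("mug", 5)                          # when
        self.assertEqual(cart.total(), 11)          # then (just one)

    def test_rejects_non_positive_quantity(self):
        cart = self.given_cart_with()               # given
        with self.assertRaises(ValueError):         # then
            cart.add("tea", 3, 0)                   # when
```

Saved in a module, this runs with `python -m unittest`. Each test has a single "then", and neither needs anything beyond the object itself to be stood up.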

marcosdumay

When testing job candidates, sure, no doubt about that.

For learning, no, it's not. You should not spend as much time learning testing as you spend learning data structures.

Jtsummers

I feel like this mischaracterizes the blog. You seem to be taking this:

> People should spend less time learning DSA, more time learning testing.

And reading it as "More total time should be spent on learning testing than the total time spent learning DSA". That's one reading, another is that people are studying DSA too much, and testing too little. The ratio of total time can still be in favor of studying DSA more, but maybe instead of 10:1 it should be more like 8:1 or 5:1.

marcosdumay

That's a fair point. But then the author makes a blatant and unrealistic generalization about how much time people spend on each of those. Between CS undergrads and introduction to programming bootcamps, the variance on that number is extreme.

karmakaze

The context is what you should spend time learning when starting out. TL;DR:

> Here is what I think in-the-trenches software engineers should know about data structures and algorithms: [...]

> If you want to prepare yourself for a career, and also stand out in job interviews, learn how to write tests: [...]

I feel like I keep writing these little context comments to fix the problem of clickbait titles or those lacking context. It helps to frame the rest of the comments which might be coming at it from different angles.