C is a powerful and amazingly fast programming language. Still, its most impressive feature is the speed with which it can empty a room full of developers when whoever makes the decisions says:
“… and it must be coded in C!”
I honestly don’t understand this “fear” of C, or why universities are adopting higher-level languages instead of C for their introductory courses. In my opinion, it should be the opposite. I believe C is the best programming language for understanding what is going on inside the machine, and learning it is an exciting experience. Please don’t think Java will teach you everything about the device you are working on; remember that Java runs on its own virtual machine.
Do you remember my first story where I said: “ … everything you do comes with a cost”? In C, you will see those costs and will learn how to deal with them and also understand why some algorithms must be implemented in C.
The Power Of C
Very important pieces of software are implemented in C, such as the Linux kernel, drivers, compilers, and programming language interpreters. Also, when you hear someone saying:
“… Python is the fastest programming language for multiplying matrices …”
guess what: the core of NumPy is implemented in C.
The proximity to the machine can work wonders. It not only lets you take control of anything you need, but also lets the compiler strip out the unnecessary instructions that human-readable code tends to carry. We can call it “the magic of cc -O3”. Don’t get me wrong: the code must still be readable and maintainable, and that will not degrade performance, because the compiler is smart enough to turn your readable and maintainable code into swift machine code.
It is impressive what the compiler can do with your code to make it fast.
I use this power when I realize the code I need to write must be fast. Not fast as in saving a few seconds, but fast in the sense that a high-level language would take hours where C takes minutes. C is also pretty handy when memory can become an issue. Outstanding examples are algorithms used to compare DNA sequences, such as Smith-Waterman, which is quadratic in both time and space.
What About The Costs
I keep telling you about the costs of doing something in your code. But what are those?
Every computer program needs memory. Memory has become an abundant resource in recent years, but it is still a problem for more complex, memory-demanding algorithms. Allocating, managing, and freeing memory is time-consuming and can become complicated. In my opinion, proof of this is the Java Garbage Collector, which is often described as slow. Of course it costs time: keeping track of all those memory allocations, continually checking whether they are still in use, and freeing them when it sees fit is a lot of work. In C, you allocate memory yourself, mindful that you must allocate just what you need, no more and no less, and you must make sure that you free each memory region once it is no longer required. There is no Garbage Collector “guessing” here.
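To make the allocate-exactly-what-you-need rule concrete, here is a minimal sketch. The function name make_squares is hypothetical and only serves as an illustration; the point is that the size is computed from the element count, the result of malloc is checked, and ownership passes to the caller, who must call free.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0*0, 1*1, ...,
 * (n-1)*(n-1). Returns NULL if n is 0, if the byte count would
 * overflow size_t, or if the allocation fails.
 * The caller owns the result and must release it with free(). */
long *make_squares(size_t n)
{
    if (n == 0 || n > SIZE_MAX / sizeof(long))
        return NULL;

    long *squares = malloc(n * sizeof *squares); /* exactly n elements */
    if (squares == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        squares[i] = (long)i * (long)i;

    return squares;
}
```

A caller would write something like long *s = make_squares(100); use the array, and finish with free(s); forgetting that last call is exactly the kind of cost a garbage collector would otherwise hide.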
An important technique for avoiding repeated allocations is to claim a memory region once, split it according to the algorithm’s needs, recycle it throughout the code, and free it at the end.
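One common way to apply this technique is a small “arena” (also called a bump allocator). The sketch below is only an illustration under stated assumptions: the type Arena and the functions arena_init, arena_alloc, arena_reset, and arena_destroy are hypothetical names, and the code assumes a C11 compiler for _Alignof and max_align_t.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena: one malloc'd block handed out piece by piece. */
typedef struct {
    unsigned char *base; /* start of the single block */
    size_t capacity;     /* total bytes in the block */
    size_t used;         /* bytes handed out so far */
} Arena;

/* Claims the memory region once. Returns 1 on success, 0 on failure. */
int arena_init(Arena *a, size_t capacity)
{
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

/* Splits off `size` bytes, aligned for any object type.
 * Returns NULL when the arena does not have enough room left. */
void *arena_alloc(Arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t start = (a->used + align - 1) / align * align;

    if (start > a->capacity || size > a->capacity - start)
        return NULL;

    a->used = start + size;
    return a->base + start;
}

/* Recycles the whole region without giving it back to the system. */
void arena_reset(Arena *a)
{
    a->used = 0;
}

/* Frees the region at the end. */
void arena_destroy(Arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

With this design, a loop can call arena_alloc for its temporary buffers and arena_reset at the end of each iteration, so malloc and free each run only once for the whole program.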
Being close to the machine allowed me to understand how much time each operation can cost. Suddenly it became easier to grasp that most of my “old” code was slow not because I wrote it in Java or even PHP, but because my implementation was doomed to be slow; migrating it to C would not have made much of a difference. Nowadays, when I code in C, it is because something must be fast, and because of that, I need to keep researching the best approaches for solving my problems. I guess you have already come across comparisons between sorting algorithms: nobody who wants speed uses bubble sort. I would even say nobody uses bubble sort at all, but rather a smarter sorting algorithm. Be mindful of your algorithmic choices.
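As a small illustration of choosing a smarter algorithm over bubble sort, the sketch below leans on qsort from the C standard library. The names compare_ints and sort_ints are hypothetical. Note that the C standard does not promise which algorithm qsort uses or how fast it is; in practice, common implementations are far better than quadratic on typical input.

```c
#include <stdlib.h>

/* Comparison callback for qsort: negative, zero, or positive.
 * Written as (a > b) - (a < b) to avoid the overflow that a - b
 * could cause for values far apart. */
static int compare_ints(const void *lhs, const void *rhs)
{
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);
}

/* Hypothetical wrapper: sorts values[0..n-1] in ascending order. */
void sort_ints(int *values, size_t n)
{
    qsort(values, n, sizeof *values, compare_ints);
}
```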
After some time programming in C, it becomes clear that the machine treats everything as numbers. What a surprise! Strings are sequences of numbers, and booleans and even pointers are numbers too.
Everything being a number opened up enormous possibilities when coding. Everything can be used in arithmetic or boolean operations. For example, this little code snippet:
int a = 0, b = 2, c = 2;
if (b == c) a += 1; // a is now 1
has the same result as this one:
int a = 0, b = 2, c = 2;
a += b == c; // b == c evaluates to 1, so a is now 1
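The same idea scales beyond a single variable. In C, a comparison such as x == y produces an int that is exactly 0 or 1, so it can be summed directly. The sketch below uses a hypothetical function, count_equal, to count matches in an array without an if. Whether this is actually faster than the if version depends on the compiler; an optimizer will often produce the same machine code for both.

```c
#include <stddef.h>

/* Counts how many elements of values[0..n-1] equal target.
 * Each comparison yields 0 or 1, which is added to the counter. */
size_t count_equal(const int *values, size_t n, int target)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (values[i] == target);
    return count;
}
```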
Number handling in C can also create some traps. The expression a > a + 1 is sometimes true. If no one is paying attention, it can cause a bug that is very hard to spot. A few days ago, I had one of those, and I stared at the screen for a few hours. But how can this happen?!
uint8_t a = UINT8_MAX;
printf("%u\n", a); // 255
a += 1;            // one step past the maximum
printf("%u\n", a); // 0
Remember that unsigned integers wrap around to zero when they overflow.
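There is one more subtlety worth spelling out. Types narrower than int, such as uint8_t, are promoted to int before arithmetic, so the wrap-around only shows up once the result is stored back into (or cast to) the narrow type. The sketch below uses three hypothetical helper functions to show when a > a + 1 is true. Also keep in mind that only unsigned arithmetic wraps; overflowing a signed integer is undefined behavior in C.

```c
#include <limits.h>
#include <stdint.h>

/* unsigned int arithmetic wraps: UINT_MAX + 1 is 0,
 * so the comparison is true for a == UINT_MAX. */
int exceeds_successor_uint(unsigned int a)
{
    return a > a + 1;
}

/* uint8_t operands are promoted to int before "+",
 * so for a == 255 the right side is 256 and this is never true. */
int exceeds_successor_u8_promoted(uint8_t a)
{
    return a > a + 1;
}

/* Converting the sum back to uint8_t brings the wrap-around back:
 * (uint8_t)(255 + 1) is 0, so this is true for a == 255. */
int exceeds_successor_u8_truncated(uint8_t a)
{
    return a > (uint8_t)(a + 1);
}
```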
Characters are numbers too. For example, 'a' - 'A' == 32 in ASCII, and again, this opens up lots of possibilities when developing an algorithm.
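Here is a minimal sketch of what that arithmetic makes possible. The functions ascii_to_upper and digit_value are hypothetical names. The first assumes an ASCII-compatible character set, where the lowercase letters are contiguous; the second relies only on the digits '0' to '9' being contiguous, which the C standard guarantees. For real code, toupper from ctype.h is the portable choice.

```c
/* Assumes ASCII: lowercase letters sit exactly 'a' - 'A' (32)
 * positions above their uppercase counterparts. */
char ascii_to_upper(char c)
{
    if (c >= 'a' && c <= 'z')
        return (char)(c - ('a' - 'A'));
    return c; /* everything else is returned unchanged */
}

/* Maps '0'..'9' to 0..9 and returns -1 for any other character. */
int digit_value(char c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}
```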
A friend of mine always said there is a Pointer Gene in our genetic code, and only people who have this gene can understand C pointers. Of course, that is not true. I believe pointers are a simple concept to understand, but they demand practice to master.
But before I try to explain what a pointer is, rest assured that practically all programming languages have them; most just hide the concept from developers. Why? Because it makes everything simpler. But without pointers, lots of features wouldn’t exist, such as passing a parameter by reference. And why doesn’t C hide pointers from us? Because if you can take control of memory, your code can be faster. And to take control of memory, you need pointers.
Pointers are memory addresses, which point to memory regions where data is stored. Imagine numbered lockers, and that someone tells you: go to locker 0x0010 and get its contents. You go to the locker 0x0010 and pick it up. Or this someone tells you: go to locker 0x0019 and put this book there. You can imagine what you should do. The numbers 0x0010 and 0x0019 would be the pointers, the lockers the memory, and the locker’s contents the data.
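The locker analogy maps directly onto two operators: & asks for the number of a locker (the address of a variable), and * opens the locker with that number (dereferences the pointer). The sketch below uses a hypothetical function, replace_contents, which puts a new value in a locker and hands back what was there before.

```c
/* slot is the locker number; *slot is what is inside the locker.
 * Stores value in the locker and returns its previous contents. */
int replace_contents(int *slot, int value)
{
    int old = *slot; /* go to the locker and read its contents */
    *slot = value;   /* put the new item in the same locker */
    return old;
}
```

Given int locker = 10; the call replace_contents(&locker, 42) returns 10 and leaves 42 in locker, because the function worked on the original variable through its address.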
But why use pointers instead of a simple type? That is where references and copies come into play. If simple types are used, the data is copied around, which can be a slow process (imagine a massive data structure). Also, if you change a copy, nothing happens to the original values. But if you use pointers, the original data is read and overwritten directly through the pointer.
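A short sketch of the difference, using a hypothetical Dataset struct and two hypothetical functions. Passing the struct by value copies all of it (here, more than 8 KB) and changes stay in the copy; passing a pointer copies only an address and changes reach the original.

```c
/* Hypothetical "massive" structure, used only for illustration. */
typedef struct {
    double samples[1024];
    int count;
} Dataset;

/* Receives a full copy: the caller's Dataset is not modified. */
void clear_copy(Dataset d)
{
    d.count = 0; /* changes the local copy only */
}

/* Receives only the address: the caller's Dataset is modified. */
void clear_in_place(Dataset *d)
{
    d->count = 0; /* writes through the pointer to the original */
}
```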
Learning any programming language takes a lot of practice, and C is no different. I would even say C demands more practice than higher-level programming languages. You must also be careful with your references (a pointer to a learning source 😉): C demands attention while coding, and many of the examples on the internet are flawed. But don’t give up on finding good references. In the end, coding in C is fun and also challenging. It will make you understand how everything works and will give you a new sense of what other programming languages must do to hide those costs (the ones I keep mentioning) from you. And don’t give up on a C coding challenge; be the last one in the room saying:
About the writer: Dr. Hugo Wruck Schneider
After quite some years of experience in industry and academia, technology and programming languages became Hugo’s second tongue. Entrepreneurship was also part of this journey, with both successful and unsuccessful startups. Today, Hugo is entirely focused on and driven by challenging software and data engineering problems. From training complex machine learning models and making them production-ready, through implementing highly available data engineering pipelines and designing high-performance GPU algorithms for bioinformatics, to exploring the applications of quantum computing in the bioinformatics field, Hugo is always up to the challenge. Do not hesitate to connect with Hugo on LinkedIn: https://www.linkedin.com/in/hugo-wruck-schneider/