Why software crashes

j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 358 times

Post by j_e_f_f_g »

Here's a test to see which programmers here have been taught proper error handling.

In preparation for researching an article I may write about creating an LV2 host, I decided to look at LV2 host libraries. First up is Lilv. As is typical of OSS documentation, the docs consist of a simple list of APIs with almost no explanation of usage, and a couple of "example" apps which of course are almost completely uncommented. In other words, if a programmer wants to use this lib, he must examine the (uncommented -- surprise, huh?) source code of the lib itself. (And Linux endusers wonder why commercial devs won't put any effort into supporting Linux?)

OK, so I download the Lilv sources and figure out that lilv_world_new() is probably the first function an app will call. I load "world.c" into gedit, and literally within 5 seconds I see:

Code:

LilvWorld* world = (LilvWorld*)malloc(sizeof(LilvWorld));
world->world = sord_world_new();

Pressing the Page Down key, I see:

Code:

LilvSpec* spec = (LilvSpec*)malloc(sizeof(LilvSpec));
spec->spec  = sord_node_copy(specification_node);

And then I see:

Code:

LilvDynManifest* desc = malloc(sizeof(LilvDynManifest));
desc->bundle = lilv_node_new_from_node(world, bundle_node);

Incidentally, this is the same sort of thing you find all over Pulse Audio's sources too.

So the question is: What have these programmers either not been taught, or failed to learn?

In another thread, an enduser wondered why oss seemed so unstable and prone to crash. The answer is because of things like the above.

Author of BackupBand at https://sourceforge.net/projects/backupband/files/
My fans show their support by mentioning my name in their signature.

j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 358 times

Re: Why software crashes

Post by j_e_f_f_g »

falkTX wrote:lilv is... on par with other libs I've used myself too.
I hope not. My point is that, due to very basic, missing error-checking, it's unsafe code that can crash any app that uses it.

Checking the return of malloc() should be one of the first things a Computer Sci student learns. I would never hire a programmer who doesn't know to do this.

j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 358 times

Re: Why software crashes

Post by j_e_f_f_g »

P.S. I see you're using the C++ new operator. I hope you're handling a bad_alloc exception.

raboof
Established Member
Posts: 1855
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL
Has thanked: 50 times
Been thanked: 74 times

Re: Why software crashes

Post by raboof »

j_e_f_f_g wrote:In another thread, an enduser wondered why oss seemed so unstable and prone to crash. The answer is because of things like the above.
Only if you mean 'things like this' rather broadly.

A typical Linux installation will overcommit on memory, so a malloc() of such small structures is highly unlikely to return NULL even in an OOM situation. You've got bigger problems (processes getting killed randomly) at that point.

You could argue it would still be useful to do error-checking here, but mostly because that is simply "how it should be done" (which might sound dogmatic but actually has some advantages).

It'd be interesting to see what most actual instabilities stem from. A tool like "apport" might be nice for that, though I haven't looked at it in detail myself so I can't really recommend it yet. In any case the availability of such a diagnostic tool cannot be an excuse for not getting it right the first time, but it might give some insight into where things typically go wrong.
male
Established Member
Posts: 232
Joined: Tue May 22, 2012 5:45 pm

Re: Why software crashes

Post by male »

Wrong. This is the least likely cause you could imagine for a segfault. I suppose next you're going to tell us that C programs crash because of 'incorrect' indentation.

I seem to recall just about everyone who tried your midiview and edrummer programs reporting an immediate crash. Why don't you begin your instruction by showing examples of your own bugs?
j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 358 times

Re: Why software crashes

Post by j_e_f_f_g »

male wrote:Wrong. This is the least likely cause you could imagine for a segfault.
The issue with the OOM (ie, Out Of Memory) Manager is as usual another example of you arguing with your own straw man. It's a total fallacy programmers have that malloc() won't return 0 due to the OOM.
male wrote:I seem to recall just about everyone who tried your midiview and edrummer programs reporting an immediate crash.
As usual, you "recall" incorrectly. I'm sure you're thinking about your own software instead.

http://linuxmusicians.com/viewtopic.php ... 891#p39928
http://linuxmusicians.com/viewtopic.php?f=1&t=11008
http://linuxmusicians.com/viewtopic.php?f=1&t=10970

male
Established Member
Posts: 232
Joined: Tue May 22, 2012 5:45 pm

Re: Why software crashes

Post by male »

j_e_f_f_g wrote:
male wrote:Wrong. This is the least likely cause you could imagine for a segfault.
The issue with the OOM (ie, Out Of Memory) Manager is as usual another example of you arguing with your own straw man. It's a total fallacy programmers have that malloc() won't return 0 due to the OOM.
male wrote:I seem to recall just about everyone who tried your midiview and edrummer programs reporting an immediate crash.
As usual, you "recall" incorrectly. I'm sure you're thinking about your own software instead.

http://linuxmusicians.com/viewtopic.php ... 891#p39928
http://linuxmusicians.com/viewtopic.php?f=1&t=11008
http://linuxmusicians.com/viewtopic.php?f=1&t=10970
When did I even mention the OOM killer? All you're doing here, Jeff, is proving that you're out of touch and don't know anything about that which you criticise. Do you offer some mechanism to prevent crashes? No. Do you really think that the people who wrote that code don't know that malloc() could possibly return NULL? If you do, then you're just once more proving how out of touch you are with reality. Again, why don't you analyse why your own software crashes and post that? Oh, wait, I know, because software crashes are a completely general problem and have nothing to do with Linux Audio or even Linux. Maybe C, but that's about as specific as the problem gets. You're doing nothing here but displaying your own foolishness for everyone to see and laugh at. And while I enjoy a bit of entertainment as much as the next guy, I grow tired of your lame old gag.
raboof
Established Member
Posts: 1855
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL
Has thanked: 50 times
Been thanked: 74 times

Re: Why software crashes

Post by raboof »

Guys, I'm going to leave the above posts alone, but be careful. If this is going to turn into a 'your code is shittier than mine'-contest I'll moderate.

Of course if you want to take specific bugs (your own or others') and explore what exactly caused the problems and how such issues could be prevented that's all good.
nils
Established Member
Posts: 538
Joined: Wed Oct 22, 2008 9:05 pm
Has thanked: 35 times
Been thanked: 94 times

Re: Why software crashes

Post by nils »

Yes, quote code excerpts and comment them, please.
male
Established Member
Posts: 232
Joined: Tue May 22, 2012 5:45 pm

Re: Why software crashes

Post by male »

raboof wrote:Guys, I'm going to leave the above posts alone, but be careful. If this is going to turn into a 'your code is shittier than mine'-contest I'll moderate.

Of course if you want to take specific bugs (your own or others') and explore what exactly caused the problems and how such issues could be prevented that's all good.
Fair enough. IMHO, it's quicker and more productive to fix bugs than complain about them in a public forum that the author doesn't even read, so you won't find me line-by-line auditing anyone's code here.
raboof
Established Member
Posts: 1855
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL
Has thanked: 50 times
Been thanked: 74 times

Re: Why software crashes

Post by raboof »

j_e_f_f_g wrote:
male wrote:Wrong. This is the least likely cause you could imagine for a segfault.
The issue with the OOM (ie, Out Of Memory) Manager is as usual another example of you arguing with your own straw man.
You seem to be confusing me and male - I was the one who brought up memory overcommit and OOM.

I'm not sure which 'straw man' you claim I'm arguing. You claimed the code was buggy because it didn't check for malloc() returning NULL, and I put it into perspective by claiming that 1) the only situation where that would do any good would be in an OOM situation, and 2) it wouldn't do much good in an OOM situation on a typical system.
j_e_f_f_g wrote:It's a total fallacy programmers have that malloc() won't return 0 due to the OOM.
Uh, no. (I'll ignore the 0-versus-NULL debate to try and keep this on-topic)

On Linux, due to memory overcommit, malloc() might not return NULL even if there's insufficient memory to back your malloc(). A simple example program can demonstrate this: it tries to allocate 12 gigs of memory. On my configuration, all these malloc() calls return a non-NULL value. Obviously, since my machine doesn't have 12 gigs of memory (and I don't use swap), this can't work, and indeed if I try to actually use the memory (in this case: writing some 'y' characters into it), the process grows too big and gets killed by the OOM killer.

Code:

#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <assert.h>

int main(void) {
  unsigned long gig_in_bytes = 1024 * 1024 * 1024;

  // This example assumes an architecture where the smallest addressable unit
  // is a byte, and the maximum size of size_t is at least a gig (i.e. any
  // modern system with a 32-bit architecture)
  assert(sizeof(unsigned char) == 1);
  assert(gig_in_bytes < SIZE_MAX);

  int gigs_to_alloc = 12;
  unsigned char * allocated_chunks[gigs_to_alloc];
  int i;
  unsigned long j;

  // first allocate a generous amount of memory. Depending on overcommit
  // settings, this might not return NULL even if this allocates more than
  // the physically available amount of memory.
  for (i = 0; i < gigs_to_alloc; i++) {
    allocated_chunks[i] = (unsigned char*) malloc(gig_in_bytes);
    assert(allocated_chunks[i] != NULL);
    // i counts from 0, so i + 1 chunks have been allocated at this point
    printf("Malloc'ed %d gig in total now\n", i + 1);
  }

  // Now actually use the memory (by writing one byte every 10000 bytes,
  // which forces the kernel to back the pages that are touched)
  for (i = 0; i < gigs_to_alloc; i++) {
    for (j = 0; j < gig_in_bytes; j += 10000)
      allocated_chunks[i][j] = 'y';
  }

  printf("Done\n");

  return 0;
}

So, this example shows even allocating huge 1-gig chunks of memory on a machine that doesn't have them available doesn't always make malloc() return NULL - which j_e_f_f_g above claimed was a 'total fallacy'. Therefore I stand by my earlier claim that checking the return value of malloc() for a small number of small allocations is unlikely to improve the stability of your application when running on a typical Linux system.

Of course this doesn't necessarily mean checking the return value of malloc() is useless. It's not that hard to think of scenarios where it could be a good idea. Nonetheless, I hope this does put j_e_f_f_g's bold claims above into some perspective.
raboof
Established Member
Posts: 1855
Joined: Tue Apr 08, 2008 11:58 am
Location: Deventer, NL
Has thanked: 50 times
Been thanked: 74 times

Re: Why software crashes

Post by raboof »

male wrote:IMHO, it's quicker and more productive to fix bugs than complain about them in a public forum that the author doesn't even read.
Well, if we can learn something from it that makes it 'productive' in my book. I hope that'll happen :).
j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 358 times

Re: Why software crashes

Post by j_e_f_f_g »

male wrote:When did I even mention the OOM killer?
In order for my statement of "My point is that, due to very basic, missing error-checking (ie not checking malloc returning 0), it's unsafe code that can crash any app that uses it." to be "wrong" (as your reply erroneously contends), then the following assumptions must be made:

1) malloc will never return 0 due to over-committing.
2) A reference to over-committed mem will result in the OOM Manager "safely" recovering enough memory to satisfy the app's reference, such that the OOM Manager won't abruptly terminate the app. (ie, The app essentially "crashes").

Both of the above are incorrect assumptions. I merely pointed out the first incorrect assumption as evidence that your contention about me being "wrong" is, as usual, wrong.

I see now that your contention wasn't even based upon one of the above mis-assumptions, but rather yet another example of you engaging in mere "truth by proclamation". Do you ever back up any of your statements with facts, or do you always resort to useless ad hominem accusations that the other person allegedly "doesn't know what (he's) talking about" (just because you say so), which is the entire content of the remainder of your reply. Typical.

male
Established Member
Posts: 232
Joined: Tue May 22, 2012 5:45 pm

Re: Why software crashes

Post by male »

j_e_f_f_g wrote:
male wrote:When did I even mention the OOM killer?
In order for my statement of "My point is that, due to very basic, missing error-checking (ie not checking malloc returning 0), it's unsafe code that can crash any app that uses it." to be "wrong" (as your reply erroneously contends), then the following assumptions must be made:

1) malloc will never return 0 due to over-committing.
2) A reference to over-committed mem will result in the OOM Manager "safely" recovering enough memory to satisfy the app's reference, such that the OOM Manager won't abruptly terminate the app. (ie, The app essentially "crashes").

Both of the above are incorrect assumptions. I merely pointed out the first incorrect assumption as evidence that your contention about me being "wrong" is, as usual, wrong.

I see now that your contention wasn't even based upon one of the above mis-assumptions, but rather yet another example of you engaging in mere "truth by proclamation". Do you ever back up any of your statements with facts, or do you always resort to useless ad hominem accusations that the other person allegedly "doesn't know what (he's) talking about" (just because you say so), which is the entire content of the remainder of your reply. Typical.
Let's try this, genius: Why don't you sift through the bug database of any of the many large projects that have a policy of never checking the return value of malloc() for NULL and tell us how many bugs you find that are attributable to the fact? You have asserted that not checking the return value of malloc() for NULL is why software crashes, and you are quite and thoroughly wrong. That is not the reason. Do whatever it takes to convince yourself of this, but don't waste this forum's time with your foolishness and misinformation.
j_e_f_f_g
Established Member
Posts: 2032
Joined: Fri Aug 10, 2012 10:48 pm
Been thanked: 358 times

Re: Why software crashes

Post by j_e_f_f_g »

j_e_f_f_g wrote:It's a total fallacy programmers have that malloc() won't return 0 due to the OOM.
raboof wrote:Uh, no.
Yes. It's a fallacy that malloc won't return 0. If malloc deduces that there's no way it can fulfill a request, for example due to memory fragmentation, exceeding a ulimit setting (or other mem management settings, such as overcommit_memory), etc, then malloc will return 0.

http://stackoverflow.com/questions/2248 ... -uses-over
http://voices.canonical.com/jussi.pakka ... -and-linux
http://compgroups.net/comp.unix.program ... ull/471850
raboof wrote:On Linux, due to memory overcommit, malloc() might not return NULL even if there's insufficient memory to back your malloc().
The key word being "might". You do realize that you're tacitly admitting that my above statement is true?
raboof wrote:allocating chunks of memory on a machine that doesn't have them available doesn't always make malloc() return NULL - which j_e_f_f_g above claimed was a 'total fallacy'.
That is not what I wrote. Reread my text which you quoted.
raboof wrote:Therefore I stand by my earlier claim that checking the return value of malloc() for a small number of small allocations is unlikely to improve the stability of your application when running on a typical Linux system.
And I stand by my claim that it should always be done, and that assumptions it's safe/pointless not to do it are incorrect.
