Actually, I should qualify what I originally posted a bit. Proprietary formats are a problem, and I'm a big fan of open source software myself. But even with a non-proprietary format, if the program that created the file is gone, the data may still be unreadable.
I will use an example to illustrate the problem to non-programmers. Let us say there was a team of programmers working for the VA in 1965, and they created a system of programs to maintain a database of veterans. To keep things simple, let's just say that there was a file that contained a veteran's service number, date of birth, date of induction, date of separation, and, I dunno, shoe size. This was a highly structured system where file names were given based on an internally-known numbering system which had some sort of meaning to it and, well, the file name is F060116. When the file was backed up for archiving it was saved as a simple flat file.
Now turn the page. Forty years have passed. All the programmers who worked on the software are retired or dead. The computer system that created the file has long since been junked. The software that was used to create and maintain the file was scrapped years ago in favor of better software that uses a different format.
Let's say that, for some reason, we have to retrieve information from the archived tape. Sure, it is a nice flat file, stored in ASCII which any tape reader can read. However, the tape reader can only tell us that we have a file named F060116 and that a typical record in the file looks like this: 0555052902007124206154608230950. Because of the file's cryptic name, we don't have any idea what data is in it. Without knowing what data is in the file, we cannot have much of an idea what the digits in each record mean. Even if we assume that the first nine digits make up the service number, we cannot be sure, and we really don't know what the rest of the digits mean. This file was dutifully and responsibly archived... and it is totally useless. Millions of dollars were spent keeping this file, and others like it, in a climate-controlled, nuclear-safe, underground abandoned limestone quarry in the Ozarks AND IT WAS A COMPLETE WASTE!
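For the programmers reading along, here is a rough Python sketch of why that record is ambiguous. Both layouts below are my own guesses (the field names and widths are hypothetical, not recovered from any real system), and both carve the same 31 digits up without any complaint. Nothing in the file tells us which guess, if either, is right.

```python
# The sample record from the post: 31 digits, no delimiters, no header.
RECORD = "0555052902007124206154608230950"

def split_fixed_width(record, layout):
    """Slice a fixed-width record using a list of (field_name, width) pairs."""
    fields = {}
    pos = 0
    for name, width in layout:
        fields[name] = record[pos:pos + width]
        pos += width
    if pos != len(record):
        raise ValueError("layout widths do not add up to the record length")
    return fields

# Guess A (hypothetical): 9-digit service number, three 6-digit dates,
# and a 4-digit shoe size.
LAYOUT_A = [
    ("service_number", 9),
    ("date_of_birth", 6),
    ("date_of_induction", 6),
    ("date_of_separation", 6),
    ("shoe_size", 4),
]

# Guess B (equally hypothetical): same service number, then two 8-digit
# fields and one 6-digit field of unknown meaning.
LAYOUT_B = [
    ("service_number", 9),
    ("unknown_1", 8),
    ("unknown_2", 8),
    ("unknown_3", 6),
]

guess_a = split_fixed_width(RECORD, LAYOUT_A)
guess_b = split_fixed_width(RECORD, LAYOUT_B)

for label, guess in (("A", guess_a), ("B", guess_b)):
    print(label, guess)
```

Guess A happens to produce values that could pass for dates, which makes it tempting, but a plausible-looking guess is still a guess. Without the original record definition there is no way to confirm it.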
So even if the software that created the file was non-proprietary, that is no help in the case above, because the software that created the file, and that gave definition to the data, no longer exists. We need formats that are not only non-proprietary but also standardized and self-describing: they should store the data and describe what the data is.
XML would work well, but how many people are archiving their data this way?
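To make that concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The element names and the sample values are made up for illustration (I am assuming one possible reading of the record above, not claiming it is the real one). The point is only that each value travels with a label, so a reader forty years on needs no copy of the original program to tell the fields apart.

```python
import xml.etree.ElementTree as ET

def to_xml(fields):
    """Write each field as a named child element of one record element."""
    root = ET.Element("veteran_record")  # hypothetical element name
    for name, value in fields.items():
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")

def from_xml(text):
    """Recover the field names and values using nothing but the XML itself."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}

# Hypothetical values, based on one guess at what the flat record means.
fields = {
    "service_number": "055505290",
    "date_of_birth": "200712",
    "date_of_induction": "420615",
    "date_of_separation": "460823",
    "shoe_size": "0950",
}

xml_text = to_xml(fields)
print(xml_text)

recovered = from_xml(xml_text)
print(recovered)
```

The labels only help if they are meaningful, of course. A tag named F060116 would leave us right back where we started, and a date still needs its format (YYMMDD or otherwise) spelled out somewhere.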