Welcome to the GNU Core Utilities FAQ. This document answers the most frequently asked questions about the core utilities of the GNU Operating System.
The master location of this document is available online at http://www.gnu.org/software/coreutils/faq/.
If you have a question that is not answered in this FAQ then please check the mailing list archives. If you find a useful question and answer please send a message to the bug list and I will add it to the FAQ so that this document can be improved. If you still don't find a suitable answer, consider posting the question to the bug lists.
An excellent collection of FAQs is available by anonymous FTP at rtfm.mit.edu and in particular the Unix FAQ is pertinent here. ftp://rtfm.mit.edu/pub/usenet/news.answers/unix-faq/faq/contents
This FAQ was written by Bob Proulx <email@example.com> as an amalgamation of the many questions asked and answered on the bug lists.
Copyright © 2001, 2002, 2003, 2004, 2005 Free Software Foundation
This document is free documentation; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This document and any included programs in the document are distributed in the hope that they will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
An online version is available at http://www.gnu.org/licenses/licenses.html.
Previously a set of three packages combined to implement the core set of GNU utilities. These were the GNU fileutils, shellutils and textutils. (Additionally shellutils was one letter too long for 14 character limited filesystems and was also known as sh-utils.) Each had its own web page. Each had its own mailing list. But all three were generally considered a set.
These three packages of fileutils, shellutils and textutils were combined into the current coreutils package. This greatly simplifies the maintenance and management of this project. The older packages are deprecated. All users are requested to update to the latest stable release of coreutils.
The info documentation is always the most up to date source of information. It should always be consulted first for the latest information on your particular installation. Here are example commands to invoke info to browse the documentation.
info coreutils
info coreutils ls
info coreutils sort
info coreutils head
Additionally the home page contains the canonical top level information and pointers to all things GNU coreutils.
Please browse the mailing list archives. It is possible, even likely, that your question has been asked before. You might find just the information you were looking for. Many questions are asked here and at least a few are answered.
Development releases of GNU coreutils source code can be downloaded from ftp://alpha.gnu.org/gnu/coreutils. These are test releases. Although they are generally of good quality, they have not been tested well enough nor matured enough to be considered stable releases. Also by definition stable means stable and the test releases change frequently. Use with care.
However when possible it helps to report any bugs as seen against the latest test release version as that one is the one in current development. Great effort is spent to ensure that the software builds easily on a large number of different systems. Compiling the coreutils is probably easier than you think if you have not done it before. If possible please build and test the latest test release to see if your problem is resolved there. (A note for Cygwin users. The MS-Windows platform requires special handling. For Cygwin I recommend using the latest precompiled binaries from the Cygwin site. They do an excellent job of handling the peculiarities and deficiencies of that platform.)
Patches for bugs and enhancements are always appreciated. Please submit patches as unified diffs against the latest CVS sources if possible. If unable to access the CVS from Savannah then please use the latest test release that is available.
Send a plain text e-mail to the <firstname.lastname@example.org> mailing list. Include the Frequently Given Answer too if you have one. Please make sure the message is formatted as a plain text message. Do not send HTML encoded messages.
Please allow some time for processing. Many people read the mail so please try to be concise. Even if you do not receive a personal response your message will have been read by the maintainers. But so many messages are posted that it can be overwhelming for us at times.
Please be sure that the bug you are reporting is actually related to the GNU coreutils. Many times people report problems in other unrelated programs. Those may be legitimate bugs but the GNU coreutils maintainers are in no position to be able to do anything about other people's software. People have reported bugs in their sound cards and crashes of their disk drives and panics of their operating system kernel. We can't help you with any of those things.
If possible please test for the presence of your problem using the latest version of the software. The GNU utilities are widely used and many times the bug will already have been found and fixed in a later version.
Check the mailing list archives to see if someone else has already reported your problem. Keep in mind that many people will be reading the list and it can be difficult to keep up with the volume of repeats. It can also be useful to search the gnu.org site using your favorite Internet search engine by specifying a site:lists.gnu.org search restriction.
If you think you have a real bug then send mail to the bug discussion list. Use a subject that is descriptive of the issue. Think of how difficult it is to follow a thread of discussion when every subject line is simply bug.
Examples of good subjects:
mv && hardlinks problem
dd and skip on Linux tape devices and pipes
assertion failure in mv
command fails to compile on platform xyz
Examples of bad subjects:
I have a question.
help!
bug?
bug report
URGENT
In your description indicate the version of the program with which you are seeing the problem. Also note the operating system type and version. The GNU utilities are part of the GNU Operating System but have been ported to run standalone on many different types of systems and some problems will be unique to them. We are good at guessing your environment but it is much simpler if we have the information without guessing.
Please make sure the message is formatted as a plain text message. Do not send HTML encoded messages. Do not send overly large messages without first establishing contact. Once someone found a way to trigger a problem with sort by giving it a 200MB sized file. They sent that datafile to the mailing list. Needless to say that was not appreciated. People on limited bandwidth connections reading the list were severely affected.
Include as small a test case as you can manage to create that will allow others to recreate the problem. If the problem cannot be recreated then it is very difficult to diagnose or fix.
Confused by all of this? Don't worry. We all start somewhere. Here is a pointer to an article written by Eric S. Raymond on how to ask good questions.
Patches to the source code are always appreciated. But reports of bugs are always welcome even if you do not feel comfortable working in the source code.
It may just be that the volunteers are busy. Please be patient. Every message to the bug lists is read by many people. Sometimes there is just nothing to say at the time. If sufficient time has elapsed, say one or two weeks, and you think your message may have been forgotten it is perfectly acceptable to send a second message asking about your first message.
But if you think your message might not have made it to the list or that you may have missed the response then check the mail archives as described above. If you do not see your message there then we did not see it either.
Or perhaps your formatting was such that it was unreadable. A number of posts are unreadable because the text was encoded and it could not be deciphered. If it does not show up as plain text in the archive then it did not get sent to the list in plain text format.
The hard work the Cygwin team has put in to developing the GNU Project on MS-Windows is greatly admired. However the GNU team generally uses GNU Operating Systems and does not have access to Cygwin systems or MS-Windows systems which means that most of us can't help you. It would be most appreciated if you would make your bug report directly to the Cygwin folks. They are the experts and best suited to handle your problem.
The GNU chown program will change the ownership if the operating system it is running upon allows it. If you can't change file ownership then it is the operating system which is restricting you and not the chown program.
Actually, the GNU chown command does not know if this is the policy of the system or not. It calls the kernel system call chown() just like any other program (e.g. perl, ruby, etc.) If the OS allows it then it will change the ownership of the file. Different systems handle this differently. Traditional System V UNIX systems allow anyone to give a file away to other owners. On those systems GNU chown does change the ownership of files.
But on most modern systems BSD semantics are followed and only the superuser can change the ownership of the file. The problem for documenting this is that GNU chown does not know which it will be running on. It could be one or it could be the other. Or it might even be running on a system without the concept of file ownership at all! This is really an OS policy decision and it is hard to maintain documentation that differs from system to system. The documentation must be independent of operating system.
The reason an operating system needs to restrict changing ownership is mostly threefold.
That was fixed in test releases for fileutils-4.1. Newer versions fix that problem.
Since the file name begins with a '-' it looks like an option to the command. You need to force it to not look like an option. Put a ./ in the front of it. Or give it the full file name path. Or tell the command you are through with options by using the double dash to end all option processing. This is common to most traditional UNIX commands.
rm ./-stuff
rm /full/path/-stuff
rm -- -stuff
And the same for other utilities too.
mv ./-stuff differentstuff
mv -- -stuff differentstuff
This is just a variant of the previous question. It also applies to files named '-i' and the like. But this question is asked often enough that it deserved an entry specifically for it.
rm ./--help
rm -- --help
rm ./-i
rm -- -i
In fact touching a file called -i in a directory is an old trick to avoid accidentally saying rm * and having it remove all of the files. Since the * expands to match all file names the first such name will be the -i. That will make the command rm -i file1 file2. As you can see that will cause rm to prompt you and if that is not what you wanted then you can interrupt the command. I don't personally like this and don't recommend it.
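The ways of hiding a leading dash described above can be tried safely in a scratch directory. This is only a minimal sketch, assuming a POSIX shell and the mktemp utility; the names demo_dir, orig_dir and remaining are made up for illustration.

```shell
# Work in a throwaway directory so that nothing real is touched.
orig_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create files whose names look like options.
touch ./-stuff ./--help ./-i

rm ./-stuff          # a leading ./ means the argument no longer starts with '-'
rm -- --help         # -- ends option processing, so --help is taken as a file name
rm "$demo_dir/-i"    # a full path works the same way as ./

# Count what is left in the directory, including dot files.
remaining=$(ls -A | wc -l | tr -d ' ')
echo "files remaining: $remaining"

# Clean up the now empty scratch directory.
cd "$orig_dir" && rmdir "$demo_dir"
```

All three removals should succeed without rm complaining about an unknown option.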
I have a directory containing some files and also '-f' as a filename and rm thinks I gave it the -f option. Why?
$ ls
-f bar foo
$ rm -i *
$ ls
-f
This is not a bug. This is normal behavior. The shell expands file globs such as '*'. The command that rm is seeing is the following. You can test that with the echo command.
echo rm -i *
rm -i -f bar foo
The -f option to rm overrides the -i option. Therefore the files are removed without asking. Since the shell expands globs like '*' before the program sees the command line it cannot distinguish between something the user typed and something that was expanded by the shell. The shell filters all command lines. This is a good thing and adds a lot of power to the system but it means you have to know that the shell expansion filter is there to write robust scripts and command lines.
To robustly write a command that does what you are wanting you need to do one of the following:
rm -i ./*
rm -i -- *
See the previous question with regards to a filename -i.
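To see what the shell actually hands to rm in each case, echo can stand in for the real command. The sketch below assumes a POSIX shell and mktemp; the variable names are hypothetical, and LC_ALL=C is set only so that the order of the expanded names is predictable.

```shell
# Byte-value ordering makes the glob results predictable for this demo.
LC_ALL=C; export LC_ALL

orig_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch ./-f bar foo

# A bare glob: the file named -f arrives looking exactly like an option.
bare=$(echo rm -i *)
# A ./ prefix: every expanded name now starts with "./" instead of "-".
prefixed=$(echo rm -i ./*)
# A -- before the glob: everything after it is treated as a file name.
guarded=$(echo rm -i -- *)

printf '%s\n' "$bare" "$prefixed" "$guarded"

cd "$orig_dir" && rm -rf "$demo_dir"
```

Because echo only prints its arguments, nothing is removed until the scratch directory itself is cleaned up at the end.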
This question is asked a number of ways:
rm -r *.exe
chmod -R 744 *.pl
chown -R user:user *.html
This is the same correct behavior as other typical programs such as ls -R, chmod -R, chown -R, etc. Try 'ls -R *.exe', for example. Here are the pieces of information you need to understand what is happening.
The -r (and -R) option says that if any of the files listed on the command line are a directory then recurse down through those directories. Only arguments to the program which are directories are recursively acted upon. So any program argument which is a directory will be removed completely which would mean recursing down that directory and removing anything below it. But if the command line argument (after shell expansion) is not a directory then it won't go searching trying to find a match.
Here is another piece of information to understand the behavior. The shell interpreter is expanding the command line glob characters prior to handing the arguments off to your command. This is a simple form of pattern matching designed to make file name matching easier. This provides a consistent interface to all programs since the expansion code is common to all programs by being in the interpreting shell instead of in the program itself. Commands in UNIX do not see the '*.exe' or any of the shell metacharacters. Commands see the expanded names that the shell found by matching file names in the current directory.
The '*' is the "glob" character because it matches a glob of characters. But it only matches files in the current directory. It does not go out and list files in other directories. The shell matches and expands glob characters and hands off the resulting information to the command.
You can double check this by using the echo command. This is built into most command shells, for example into bash. Try echo *.exe. Try echo */*.exe. In your example the first would print out *.exe if nothing matched but would print out all file names that did match. The command would see the result and has no idea that you provided a wild card to match against file names.
If you want to match files in subdirectories as well then you would need to say so explicitly with */*.exe. The first star would match all file names in the current directory. Then the second *.exe would be matching files in the subdirectories under names already matched by the first '*' glob.
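The expansion rules just described can be checked with echo in a scratch directory. This is a sketch under assumptions: a POSIX shell with default globbing options, the mktemp utility, and made-up file and variable names.

```shell
# Byte-value ordering keeps the expansion order predictable.
LC_ALL=C; export LC_ALL

orig_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
mkdir sub
touch top.exe sub/nested.exe

# Only names in the current directory match.
top_level=$(echo *.exe)
# Names exactly one directory below match; deeper levels would not.
one_down=$(echo */*.exe)
# With default shell options a pattern that matches nothing is passed through unchanged.
no_match=$(echo *.nomatch)

printf '%s\n' "$top_level" "$one_down" "$no_match"

cd "$orig_dir" && rm -rf "$demo_dir"
```

The command on the receiving end sees only these already expanded words and cannot tell that a wildcard was ever typed.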
All of that was to explain why things are working as they should. But here is what you really want to do. If you want to search all directories below a location, let's say your present working directory, then you can use other UNIX commands such as 'find' to do so. Here are examples, untested, use at your own peril, that act on all matching files below your current working directory.
Works for small numbers of files:
chgrp mygrp $(find . -name '*.html' -print)
The $(command) construct executes the find command, takes the output of the find command, places it right there on the command line in place of the $() and hands the results off to the command, chgrp in this case. BEWARE! Test this out with the echo command prior to real usage! Note that the '*.html' is quoted to keep the shell from expanding it. The find command will do the expansion itself in this case and so the '*' needs to be hidden in a string to keep the shell from expanding it first.
echo rm -f $(find . -name '*.exe' -print)
echo chmod u+w $(find . -name '*.html' -print)
Unfortunately this transferal of functionality from the command to the shell comes at a cost. There is a limited amount of argument space that is available for this argument expansion of file names. It is different on different systems and getting larger as RAM gets cheaper but almost always there is still a limit. 20KB was typical for a time and now 2MB is common but it is a limit regardless and additionally it is usually shared with environment variable space. The xargs command was designed specifically to work around this limit. If you have a HUGE subdirectory with thousands of files the above command will fail to execute. Therefore a better method is to use find coupled with xargs.
find . -name '*.exe' -print | xargs chmod a+x
Robust and safer but not yet universally implemented using the -print0 option and zero terminated strings instead of newline terminated strings:
find . -name '*.exe' -print0 | xargs -0 chmod a+x
You could substitute a full path in place of the 'find .' such as something like 'find /class/home'. Note that this topic is closely related to the following question, which may make it easier to understand what is going on.
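The find and xargs combination can be rehearsed on a scratch tree before it is pointed at real files. The following is a sketch, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do) plus mktemp; all file and variable names are invented. The -exec ... {} + form is shown as an alternative that needs no xargs.

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/a/b"
touch "$demo_dir/one.exe" "$demo_dir/a/two words.exe" \
      "$demo_dir/a/b/three.exe" "$demo_dir/a/notes.txt"

# Preview first: echo shows the command line that xargs would run.
find "$demo_dir" -name '*.exe' -print0 | xargs -0 echo chmod a+x

# NUL terminated names survive spaces (and even newlines) in file names.
find "$demo_dir" -name '*.exe' -print0 | xargs -0 chmod a+x

# Alternative: find batches the arguments itself, much as xargs does.
find "$demo_dir" -name '*.exe' -exec chmod a+x {} +

# Check the result: the three .exe files are executable, the .txt file is not.
exec_count=0
for f in "$demo_dir/one.exe" "$demo_dir/a/two words.exe" "$demo_dir/a/b/three.exe"; do
    if [ -x "$f" ]; then exec_count=$((exec_count + 1)); fi
done
if [ -x "$demo_dir/a/notes.txt" ]; then txt_exec=yes; else txt_exec=no; fi
echo "executable .exe files: $exec_count, notes.txt executable: $txt_exec"

rm -rf "$demo_dir"
```

The file name containing a space is the interesting case: with plain -print and xargs it would be split into two arguments.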
It is extremely unlikely that you will be able to recover a file from the filesystem after you have removed it. It depends upon your operating system and the filesystem used with it. The disk blocks storing the file would have been freed and very likely reused by other files. This is especially true on an active file system. However, with that disclaimer given, it is sometimes possible to recover deleted files.
Matt Schalit <mschalit at pacbell dot net> suggests in a message to the list that the user may want to look into The Coroner's Toolkit, unrm, and lazarus.
Ciaran O'Riordan <coriordan at compsoc dot com> wrote suggesting that users should look at the programs 'recover' and 'gtkrecover', the latter being a GUI for the former. The tool was used with great success when O'Riordan's 700 MB music collection was deleted. It is GPL and is a Debian package. It is ext2 specific but is available for both GNU/Linux and GNU/Hurd. Also, the 'recover' homepage has a link to a manual way to retrieve deleted files.
On Debian systems:
apt-get install recover
See also “Linux Ext2fs Undeletion mini-HOWTO” by Aaron Crane, which contains very useful information about this topic. This may already be present on your system at /usr/share/doc/HOWTO/en-txt/mini/Ext2fs-Undeletion.gz or might be found in other places.
See also “Linux ext3 FAQ” by Juri Haberland at http://batleth.sapienti-sat.org/projects/FAQs/ext3-faq.html. Thanks to Paul Dorweiler for submitting that reference for this FAQ.
Q: How can I recover (undelete) deleted files from my ext3 partition?
Actually, you can't! This is what one of the developers, Andreas Dilger, said about it:
In order to ensure that ext3 can safely resume an unlink after a crash, it actually zeros out the block pointers in the inode, whereas ext2 just marks these blocks as unused in the block bitmaps and marks the inode as "deleted" and leaves the block pointers alone.
Your only hope is to "grep" for parts of your files that have been deleted and hope for the best.
You don't want your files to be recovered and you want them to stay deleted? The GNU coreutils package includes a utility called shred which may be of some use to prevent it. Read the documentation on that program for more details, especially the exceptions section describing where shred does not work. This operation does not work on journaled filesystems.
People frequently wonder why the individual utilities do not support find type of operations. This is part of the subtle but beautiful design of the UNIX system. Programs should be simple and modular. Common behavior should be modularized into a common location where it can be used by other programs. More complicated programs are created by chaining together simpler programs.
Here is an example of the UNIX philosophy in action. Let's say you want to restrict directory listings to show only directories. The normal thing would be to use a combination of commands. This will do what you want.
ls -al | grep ^d
This assumes you are using bash. Some other shells, such as the old sh, treat ^ as special, a synonym for |, and there you would have to quote it. The command generates the full listing and grep shows you only the lines starting with 'd'. Those are the directories. Chaining commands this way is a very typical way of doing things under UNIX.
When people find that particular combinations of commands are ones that they use a lot then they will typically create a shell script, a shell function or a shell alias which does this with a shorter name. I like shell scripts because they are easier to pass around. If you found the above command one that you always used then perhaps you would create the following shell script.
#!/bin/sh
ls -al "$@" | grep ^d
You could call it whatever you desired. Then it would work just like ls but with your special behavior. For all intents and purposes it would be just like a normal UNIX command and its output could be piped into other commands.
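A shell function is one of the other packaging options mentioned above. Here is a minimal sketch of the same pipeline as a function; the name lsdirs is hypothetical, the pattern is quoted so that shells which treat ^ specially leave it alone, and the scratch directory exists only to exercise it.

```shell
# lsdirs (hypothetical name): long listing restricted to directory entries.
lsdirs() {
    ls -al "$@" | grep '^d'
}

# Try it on a scratch directory holding two directories and one regular file.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/docs" "$demo_dir/src"
touch "$demo_dir/README"

lsdirs "$demo_dir"

# The listing holds ".", "..", docs and src, so four lines are expected.
dir_lines=$(lsdirs "$demo_dir" | wc -l | tr -d ' ')
echo "directory lines: $dir_lines"

rm -rf "$demo_dir"
```

Unlike a script, a function lives only in the shell that defined it, which is why scripts are easier to pass around.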
I typed in ls -dl but it only showed me the current directory.
Well, yes, that is what it is supposed to do. The default directory to list if none is specified is the current directory. The -d option makes ls list a directory itself rather than its contents. Therefore ls -ld lists only attributes of the current directory.
But I expected to see all of the directories.
That would be the output of 'ls -l' without the -d option.
The -d option is meant to prevent ls from listing the contents of directories when you only want it to list the names of the directories.
To understand the usefulness of the -d option try this example. Compare the differences in the results of ls when you use the -d option versus when you do not.
ls -ld /etc/*.d
ls -l /etc/*.d
If you are trying to find files in the directory hierarchy then you should look into using the find command. It is very powerful and contains an interface which can be used with many other programs.
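The difference between ls with and without -d, and the find alternative, can be compared on a small scratch tree. This sketch assumes a POSIX shell and mktemp; the directory layout merely imitates /etc and every name in it is invented.

```shell
# Byte-value ordering so that ls and sort agree on the order.
LC_ALL=C; export LC_ALL

demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/etc/cron.d" "$demo_dir/etc/init.d"
touch "$demo_dir/etc/hosts" "$demo_dir/etc/init.d/rc"

# With -d the matching directory names themselves are listed.
with_d=$(ls -d "$demo_dir"/etc/*.d)
# Without -d the contents of each matching directory are listed.
without_d=$(ls "$demo_dir"/etc/*.d)
# find selects directories by type and name anywhere below the start point.
found=$(find "$demo_dir/etc" -type d -name '*.d' | sort)

printf 'ls -d:\n%s\n\nls:\n%s\n\nfind:\n%s\n' "$with_d" "$without_d" "$found"

rm -rf "$demo_dir"
```

Here ls -d and find report the same two directories, while plain ls also shows the file inside init.d.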
When I list directories using 'ls *', or even 'ls -a *' I never see hidden files, e.g. files with names that start with a dot as in .profile.
The command shell expands the '*' before ever handing it to a command. This is regardless of it being ls as it could be any command on the command line. The '*' is called the glob character because it matches a glob of characters. This process is called file name globbing. This is documented in your shell manual. Bash documents this well. In the bash info pages look for the section titled “Filename Expansion”.
A '*' is a shell pattern and is replaced by a list of files that match that pattern. Filenames that start with a '.' do not match that pattern. Neither does a '?' match a leading '.'. The dot character must be explicitly matched when it occurs at the start of the filename.
You should test what input is being given to commands by the shell with the echo command. Try these patterns as starting examples. Try this in your home directory where there are usually rich examples of dot files.
echo *
echo .*
echo .* *
echo .?*
echo .??*
As you will see from those examples the ls command is only listing out the files that were presented to it by the shell. Which answers why the dot files were not listed. In fact, what is ls doing that you can't do yourself? Very little in this particular case. You might as well use echo for listing and then you can use the fmt command to word wrap to your screen. You can also use tr and grep to finish the job. Suddenly using ls for this seems more convenient. Especially when coupled with the -l option!
echo * | fmt
echo .* | fmt
echo .* * | fmt
echo .* * | tr " " "\012" | grep profile
Okay, now I understand why ls * did not list out hidden dot files. But then how does one list out dot files?
There are several ways that are typically used. Here are a variety of examples that should spark ideas for you. Try them out and compare and contrast their differences.
ls -a | grep profile
ls -d .*
ls -d .??*
ls -d .[^.]*
ls -d .[!.]*
Some are more convenient than others. But dot files are meant to be hidden files. Therefore it is reasonable that you will need to do a little more work to unhide them.
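A few of these patterns can be compared side by side in a scratch directory. This is a sketch assuming a POSIX shell and mktemp, with invented file names. The plain .* pattern is left out of the comparison on purpose because shells differ on whether it also matches . and .. entries.

```shell
# Byte-value ordering keeps the expansion order predictable.
LC_ALL=C; export LC_ALL

orig_dir=$PWD
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch .profile .a visible

# A bare star skips every name that begins with a dot.
plain=$(echo *)
# A dot followed by one non-dot character, then anything.
dot_nondot=$(echo .[!.]*)
# A dot followed by at least two more characters; misses short names like .a
dot_long=$(echo .??*)
# ls -a lists everything, so grep can pick out the name wanted.
ls_all=$(ls -a | grep profile)

printf '%s\n' "$plain" "$dot_nondot" "$dot_long" "$ls_all"

cd "$orig_dir" && rm -rf "$demo_dir"
```

The .??* form is a common habit, but as the .a file shows it quietly skips two character dot files.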
You should also read over the answers to other questions in this FAQ as this is a very similar theme. Read the documentation on your shell. For GNU systems info bash will launch the info system on the bash documentation. I also recommend reading one of the many fine shell programming books available from the bookstore.
I tried to move about 5000 files with mv, but it said:
bash: /bin/mv: Argument list too long
The UNIX operating system traditionally has a fixed limit for the amount of memory that can be used for a program environment and argument list combined. You can use getconf to return that limit. On my Linux system (2.2.12) that amount is 128k. On my HP-UX system (11.0) that amount is 2M. It can vary per operating system. POSIX requires a minimum of only 4096 bytes, and 20k was the traditional value used for probably 20 years. Newer operating system releases usually increase that somewhat.
getconf ARG_MAX
131072
Note that your message came from "bash" your shell command line interpreter. Its job is to expand command line wildcard characters that match filenames. It expands them before any program can see them. This is therefore common to all programs on most UNIX-like operating systems. It cannot exceed the OS limit of ARG_MAX and if it tries to do so the error "Argument list too long" is returned to the shell and the shell returns it to you.
This is not a bug in 'mv' or other utilities nor is it a bug in 'bash' or any other shell. It is an architecture limitation of UNIX-like operating systems. The 'mv' program was prevented by the OS from running and the shell is just the one in the middle reporting the problem. The shell tried to load the program but the OS ran out of space. However, this problem is one that is easily worked around using the supplied utilities. Please review the documentation on 'find' and 'xargs' for one possible combination of programs that work well.
You might think about increasing the value of ARG_MAX but I advise against it. Any limit, even if large, is still a limit. As long as it exists then it should be worked around for robust script operation. On the command line most of us ignore it unless we exceed it at which time we fall back to more robust methods.
Here is an example using chmod where exceeding ARG_MAX argument length is avoided.
find htdocs -name '*.html' -print0 | xargs -0 chmod a+r
Read the previous question for another facet of this problem.
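The way xargs stays under the limit is by splitting its input across several invocations of the command. The sketch below makes that visible by forcing very small batches with -n 2, which is for demonstration only; real use would normally let xargs pick the batch size. The variable names are hypothetical.

```shell
# Ask the system for its combined argument and environment limit, in bytes.
arg_max=$(getconf ARG_MAX 2>/dev/null) || arg_max=unknown
echo "ARG_MAX on this system: $arg_max"

# xargs runs the command as many times as needed to consume all of its input.
# Here each echo invocation receives at most two arguments.
batches=$(printf '%s\n' one two three four five | xargs -n 2 echo)
printf '%s\n' "$batches"
```

Each output line corresponds to one separate run of echo, which is exactly what happens with mv or chmod when the input is too long for a single command line.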
This frequently occurs with a question such as, I have found a bug in the sleep() function, or, I am trying to use the chown() function in my program. That may well be true. But those library functions are not the same as the command line programs. The command line programs in the GNU utilities are also C programs and, just like your program, use the same C library and operating system calls. But we can't help you with your problem since that code is in the system's C library and not in any of the utility packages.
Sometimes the same name is used for both a program and a system call. In fact most of the programs got their names because that was what the system call was named. One caused the other. Which means that it is very likely that the sleep program was so named because it used the sleep library routine.
This usually happens with sleep, chown, or one of the other programs that have a name matching a C programming routine. It can be confusing to realize that the program is a wrapper to the underlying library routine or system call. If you are compiling a program then you are NOT using the GNU command line utility in your C/C++ program. You could be using the GNU command line utility in your shell script, however.
You may be confusing the C library routine used in your program with the shell command line program. They are not related. Except that the program calls the library function just like your program does. If you have a real bug in the library routine then you need to determine who supplied that library and report the bug to them. If it is the GNU Lib C, a.k.a. glibc, provided library then the glibc mailing list at <email@example.com> might be a useful future reference.
The online man pages can also be confusing in regard to this. If you are trying to use chown() in a program and get a man page you will likely get the man page for the command line program and not the library routine you were looking for. You must specify the section of the manual from which you want information.
man chown
man 2 chown
man sleep
man 3 sleep
Traditionally man pages are organized by section number. Section 1 covers command line programs. Section 2 covers operating system calls, a.k.a. kernel calls. Section 3 covers library routines, which may or may not use system calls. Other sections are used for other purposes that are not germane here.
Many GNU programs have deprecated the man pages and have moved entirely to info documentation. But man pages still have a loyal following and are quite useful for reference. But they make poor user guides.
This is frequently a variant of the previous question. Go read it first and then come back here. This time the same name is used by a utility and is also built into a command line shell. In which case you might need to report your problem to your shell maintainers.
Many commands are built into your command shell so as to speed up shell scripts. People frequently get confused over whether they are running the external program or the internal shell built in. This usually happens with commands that exist both as internal and external commands. Double check that the problem you are having is with the external program. If it is the internal version then contemplate reporting it to your shell maintainers.
Many times a program is required by circumstances to exist both as a builtin and as an external. Usually this is because of the need to be exec'd directly from another program without a shell to implement it as a builtin. Therefore it must exist as an external, standalone utility. However, those utilities are also built into most shells in order to improve performance.
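One way to tell which version is being run is to ask the shell how it resolves the name, and to use env when the external program is wanted. This is a sketch assuming a Bourne style shell such as bash or dash and an external echo somewhere on PATH; the variable names are made up, and the exact wording printed by type varies between shells.

```shell
# 'type' reports how the shell would resolve a command name.
echo_kind=$(type echo)
echo "$echo_kind"

# The builtin echo in bash and dash treats --version as ordinary text.
builtin_out=$(echo --version)
echo "builtin printed: $builtin_out"

# env is not part of the shell, so the command it starts is always looked up
# on PATH and run as an external program.
external_out=$(env echo hello)
echo "external printed: $external_out"

# With GNU coreutils installed the external echo answers --version with a
# version banner instead; other implementations may behave differently.
env echo --version | head -n 1
```

If a problem shows up with the plain command but disappears when it is run through env, the shell builtin is the one to report.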
This one question arises almost more often than any other. It is due to a very popular Linux distribution setting LANG=en_US in your user environment without your knowledge. At that point sort appears broken. But once specifically requested by LANG and others like LC_* variables, sort and other locale knowledgeable programs must respect that setting and sort according to the operating system locale tables.
Of course those tables are a blessing to non-English speaking computer users. Many languages use non-ASCII fonts and character sets. The POSIX standards and the GNU utilities support those by using the installed system library sorting routines. By using them languages can specify a specific ordering. This can be done by a translator well after the program has been written and by translators not known by the program author. It is a dynamic change to the program. However, when those tables are incorrect it can also break a perfectly correct program. When locale tables are broken this is most noticeable with the sort command so it bears the full force of the problem.
Here is the standard mailing list reply on this topic.
This is due to the fact that you or your vendor have set environment variables that direct the program to use locale specific sorting tables which do not sort as you expect. You or your vendor have probably set environment variables like LANG, LC_ALL, or LC_COLLATE to en_US. There appears to be a problem with that table on some systems which is not part of the GNU program but part of your vendor's system release.
Unset them, and then set LC_ALL to POSIX.
# If you use bash or some other Bourne-based shell,
export LC_ALL=POSIX
# If you use a C-shell,
setenv LC_ALL POSIX
and it will then work the way you expect because it will use a different set of tables.
See the standards documentation for more information on the locale variables with regards to sort.
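The effect of the locale on sort can be seen with a four line input. This is a sketch with invented variable names; the locale can also be overridden for a single command by putting the assignment in front of it, which avoids changing the whole session.

```shell
words='b A a B'

# Sorted according to whatever locale the environment currently selects.
# The result depends on LANG and the LC_* variables, so it is only printed.
sorted_env=$(printf '%s\n' $words | sort)
echo "current locale:" $sorted_env

# Sorted by byte value for this one command only: uppercase before lowercase.
sorted_c=$(printf '%s\n' $words | LC_ALL=C sort)
echo "LC_ALL=C:      " $sorted_c
```

If the two lines differ then the locale tables, not sort itself, are deciding the order seen in everyday use.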
This is just a variant of the previous question. Any program that is compliant with the standards and implements locale based collating sequences to support non-ASCII languages will be affected.
See the standards documentation for more information on the locale variables with regards to ls.
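The same check works for ls. This is a minimal sketch, assuming a POSIX-style shell and a mktemp that accepts -d. The scratch directory and the file names in it are made up for the demonstration.

```shell
# Work in a scratch directory so the listing is predictable.
scratch=$(mktemp -d)
touch "$scratch/a" "$scratch/B" "$scratch/c"

# Byte-value ordering: the uppercase name comes first (B, a, c).
posix_listing=$(LC_ALL=POSIX ls "$scratch")
printf '%s\n' "$posix_listing"

# Ordering under the locale currently in effect, for comparison.
ls "$scratch"

rm -r "$scratch"
```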
If you are using date version 2.0 or earlier that is certainly possible. A large number of bug fixes and improvements went into sh-utils-2.0.11 around October 2000. Please install a newer version of the code.
Symbolic links are created with ln -s. That creates a name redirection. When a symlink is accessed the filesystem takes the contents of the symlink as a redirection to another file, and that process may continue recursively many times. If you meant "ln -s a /c/b" then that would create /c/b, which would be a relative symlink to the file "/c/a". If /c/a did not exist then the symlink would dangle until such time as that file was created. The owner, group and mode of a symlink are not significant to file access through it.
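The relative and dangling behavior can be tried safely in a scratch directory. This is a minimal sketch, assuming a POSIX-style shell plus the mktemp and readlink utilities. It uses a directory c under a temporary location instead of the /c of the example, and the variable names are only for illustration.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/c"
cd "$scratch" || exit 1

# Create c/b as a symlink whose value is the relative name "a".
ln -s a c/b

# The stored value is just the string "a", resolved relative to c/.
link_value=$(readlink c/b)
echo "c/b -> $link_value"

# c/a does not exist yet, so the link dangles and -e reports false.
if [ -e c/b ]; then before=resolves; else before=dangling; fi
echo "before creating c/a: $before"

# Once c/a exists the same link resolves.
echo data > c/a
if [ -e c/b ]; then after=resolves; else after=dangling; fi
echo "after creating c/a: $after"

cd / && rm -r "$scratch"
```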
Symbolic links may use either absolute or relative paths but there are trade-offs. Generally I advocate making only relative links so that the location is network independent and the links keep working across NFS mounted filesystems.
The owner, group, and permissions of a symlink are not in any way significant. Only the value of the symlink is meaningful. Regardless of that, some operating systems will allow you to change the owner, group or mode of a symlink and other operating systems will not. Do not worry about it as it does not matter in any case.
It means that your version of the utilities was not compiled with large file support enabled. The GNU utilities do support large files if they are compiled to do so. You may want to recompile them and make sure that large file support is enabled. This support is automatically configured by autoconf on most systems. But it is possible that on your particular system autoconf could not determine how to enable it and therefore concluded that your system did not support large files.
The message "Value too large for defined data type" is a system error message reported when an operation on a large file is attempted using a non-large file data type. Large files are defined as anything larger than a signed 32-bit integer can represent, or stated differently, larger than 2GB.
Many system calls that deal with files return values in a "long int" data type. On 32-bit hardware a long int is 32 bits wide and therefore this imposes a 2GB limit on the size of files. When this was invented that was HUGE and it was hard to conceive of needing anything that large. Time has passed and files can be much larger today. On native 64-bit systems the file size limit is usually 2^63 bytes, which is 2GB * 4GB, and which today again seems huge.
You can see that on a 32-bit system with a 32-bit "long int" the limit cannot be made any bigger, at least not while maintaining compatibility with previous programs. Changing that would break many things! But many systems make it possible to switch into a new program mode which rewrites all of the file operations into a 64-bit program model. Instead of "long" they use a data type called "off_t" which in that mode is constructed to be 64 bits in size. Program source code must be written to use the off_t data type instead of the long data type. The mode is typically enabled by compiling with -D_FILE_OFFSET_BITS=64 or some such. It is system dependent. Once that is done and the program is switched into this new mode, most programs will support large files just fine.
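One way to find out whether a particular utility was built with large file support is to hand it a file just over the limit. This is a minimal sketch, assuming a POSIX-style shell, a mktemp that accepts -d, and a filesystem that supports sparse files so that the 3GB test file takes up almost no real disk space. The file name big and the choice of wc as the utility under test are only for illustration.

```shell
scratch=$(mktemp -d)
big="$scratch/big"

# Create a sparse file of 3072 * 1048576 bytes: seek past the 2GB
# boundary and copy nothing. dd extends the file to the seek offset
# without writing any data blocks.
dd if=/dev/null of="$big" bs=1048576 seek=3072 count=0 2>/dev/null

# A wc built with large file support reports the size.
# One built without it fails with an error instead.
if size=$(wc -c < "$big"); then
    echo "wc handled a file of $size bytes"
else
    echo "wc could not handle a file over 2GB"
fi

rm -r "$scratch"
```

Note that the shell itself opens the file for the redirection, so a failure here can also point at the shell rather than at wc.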
See the next question if you have inadvertently created a large file and now need some way to deal with it.
I created a file with tar cvf backup.tar. Trying to "rm" this file is not possible. The error message is:
rm: cannot remove `backup.tar': Value too large for defined data type
What can I do to remove that file?
Sometimes one utility such as tar will be compiled with large file support while another utility like rm will be compiled without. It happens, which means you might find yourself with a large file created by one utility but unable to work on it with another.
At this point we need to be clever. Find a utility that can operate on a large file and use it to truncate the file. Here are several examples of how to work around this problem. Of course in a perfect world you would recompile the utilities to support large files and not worry about needing a workaround.
This example requires perl to be configured for large files.
perl -e 'unlink("backup.tar");'
So let's try to hit it more directly. Truncate the file first. That will make it small and then you can remove it. The shell will do this when redirecting the output of commands.
true > backup.tar
rm backup.tar
However, if your shell was not compiled for large files then the redirection will fail. In that case we have to resort to more subtle methods. Since tar created the file then tar must be configured to support large files. Use that to your advantage to truncate the file.
tar cvf backup.tar /dev/null
That is the required behavior, and it depends upon the endianness of the underlying machine. You are probably mixing up words and bytes.
The -x option outputs short integers (note: integers, each of which is a word and not a byte) in the machine's short integer format. If you are operating on a little endian machine such as an x86 then the bytes appear in 'backwards' order. Here is what --help says.
`-x' Output as hexadecimal shorts. Equivalent to `-tx2'.
If you require a specific byte ordering, note bytes not words, then you need to supply a byte specification such as 'od -t x1'.
echo abcdefgh > /tmp/letters
od -cx /tmp/letters
0000000   a   b   c   d   e   f   g   h  \n  \0
        6261 6463 6665 6867 000a
0000011
od -t cx1 /tmp/letters
0000000   a   b   c   d   e   f   g   h  \n
         61  62  63  64  65  66  67  68  0a
0000011
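The same difference can be used to see which byte order your own machine has. This is a minimal sketch, assuming a POSIX-style shell with od and tr available. The two input bytes and the variable names bytes and word are chosen only for illustration.

```shell
# Byte-by-byte dump of the two bytes 0x01 0x02: the result is the same
# on every machine because no multi-byte integers are involved.
bytes=$(printf '\001\002' | od -An -tx1 | tr -d ' \n')
echo "bytes: $bytes"

# The same input read as one two-byte word: the result depends on the
# byte order of the host.
word=$(printf '\001\002' | od -An -tx2 | tr -d ' \n')
case $word in
    0201) echo "word:  $word (little endian host)" ;;
    0102) echo "word:  $word (big endian host)" ;;
    *)    echo "word:  $word (unexpected)" ;;
esac
```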
If you search the web for “little endian big endian” you should turn up many hits for various documentation on this subject. But for sure you should read “On Holy Wars and a Plea for Peace” written by Danny Cohen and published in IEEE Computer years ago as it is a classic treatise on the subject.
The expr program appears to be broken because expr 2 * 3 produces a syntax error.
expr 2 * 3
expr: syntax error
The case shown is not quoted correctly. As such it provides a good example of incorrectly quoted shell metacharacters. The “*” is being expanded by the shell, so expr sees filenames, and that causes the syntax error.
The “*” character is a special character to the command shell. It is called a glob character because the shell expands it to match filenames. It matches a glob of characters. The shell then passes the result to the command. The expr program does not see the star character in “2 * 3” but instead sees a “2” followed by every filename matched from the current directory and then finally the “3”.
Your command is really something completely different than you thought it was going to be. Use echo to see what your command line really is:
echo expr 2 * 3
You will see it matching filenames, which is what causes the syntax error.
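To make the expansion predictable, the echo test can be run in a directory whose contents are known. This is a minimal sketch, assuming a POSIX-style shell and a mktemp that accepts -d. The scratch directory and the file names a and b are made up for the demonstration.

```shell
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch a b

# Unquoted: the shell replaces * with the filenames before echo runs,
# so this prints: expr 2 a b 3
unquoted=$(echo expr 2 * 3)
echo "$unquoted"

# Quoted: the * reaches the command untouched,
# so this prints: expr 2 * 3
quoted=$(echo expr 2 '*' 3)
echo "$quoted"

cd / && rm -r "$scratch"
```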
There are many entries in this FAQ related to command line “*” glob expansion. If you are having trouble with this concept then read through the other entries for a fresh perspective.
You need to quote shell metacharacters to prevent them from being modified by the shell. Here are some possible ways of quoting this example.
expr 2 \* 3
expr 2 '*' 3
expr 2 "*" 3
The man page for expr carries this text:
Beware that many operators need to be escaped or quoted for shells.