This might be a sort of FAQ, but I don’t see why this fails,
so could someone help me understand what’s wrong?
I’ve just written the following code, which should remove
the white space from a given string.
But it results in a segmentation fault, and the same happens
when running under gdb (which says “Program received signal
SIGSEGV, Segmentation fault” at *p++ = *st++).
The platform is Linux kernel 2.4.27, gcc version
2.95.4 20011002.
/*-------------------------------------------------------*/
#include <stdio.h>
#include <ctype.h>
int main(int argc, char ** argv)
{
    char *st = "Hey, how are you?";
    char *p, *s;
    p = s = st;
    while( *st )
    {
        if ( isspace( (int)*st ) )
            st++;
        else
            *p++ = *st++;
    }
    *p = '\0';
    printf("whitespace trimmed : %s\n", s);
    return 0;
}
/*-------------------------------------------------------*/
I don’t see what’s wrong with copying the characters in the
string to another place in memory, since the writes should
never go past the end (the '\0') of the original
string.
Could someone help me to understand what’s wrong with
this code?
Thanks and Best Regards,
Cocy
******************************
Generally, the compiler and linker have almost complete control over
this, and merely leave instructions for the OS and dynamic linker to
follow.
> called “heap”?
During linking, this will usually be merged (along with “.init”,
“.fini”, and any number of other read-only segments) with the “text”
segment into a “program header” covering all of the read-only loadable
segments. You can view this by using (on Linux):
% objdump -x /path/to/binary
or, (on Solaris):
% elfdump /path/to/binary
Look for “Program Header” and “Section”, and start lining things up.
> my question. Is there any source to learn about
> these things? (please don’t say “take the class in
> school” :-) Does the assembly code tell me those
> stuffs?
1558604960) is the best book I’ve run across for these sorts of details.
You could also google for documentation on ELF (Executable and Linking
Format), which is the standard object-file format on most UNIX systems.
Cheers,
- jonathan
******************************
Oct 15 2004, 9:22 am
>… “the [contents of a] string [literal] may be stored in read-only
>memory” … [so] where is the area?, who decide the area?
>I mean does the compiler tell to someone (OS?) like
>“please let this program use this memory area as to
>be const”? or does OS decide like “Oh, you, the
>program, I’ll keep the string into the safety area
>so that you can’t modify later”?
dependent.
What happens if there is no operating system at all? In this
case, the compiler is the *only* entity involved, so it must be
the compiler that decides.
On the other hand, suppose there is a strict operating system,
in which programs — including compilers — must beg and plead,
as it were, for every resource? In this case, *only* the OS
can create read-only regions containing “precooked” data (such
as the characters in the string). The compiler can ask, but
the OS decides.
One thing is clear enough, though: the compiler has to at least
ask, in some fashion or another. Suppose the OS (assuming one
exists) is simply presented with “here is a bunch of data”, e.g.,
the contents of both of these arrays:
    char modifiable[] = "hello";
    const char unmodifiable[] = "world";
so that the OS sees an undifferentiated sequence of data:
hello\0world\0
How will this OS determine which of these is supposed to be read-only?
>”read-only memory”, or only OS knows where it is?
>(is only OS able to decide where it is)?
>called “heap”?
a data structure (see, e.g., <http://c2.com/cgi/wiki?HeapDataStructure>),
and what the C99 standard refers to as “allocated storage” — memory
managed via malloc() and free(). (The C++ standard has a different,
and I think better, term for the latter.)
There are at least three (or more, depending on how you count)
different ways that C strings are commonly implemented, depending
on OS (if any) and compiler and object-file format. None of them
are called “heap”, at least, not unless you want to confuse other
people :-) .
One method is to have, in the object file (“.o” or “.OBJ”, in many
cases) format, a section or region-type-marker called a “read-only
data area” or “read-only data segment” or something along those
lines. All read-only data is marked this way, including the contents
of string literals that are not used to initialize read/write data.
(A short name for this is “rodata” or “the rodata section”.)
Another method is to have a special “strings” section. String
literals are placed in a strings section, and identical string
literals in separate files can then be coalesced. (If string
literal contents are in ordinary rodata sections, it becomes more
difficult to merge them across separate object files — “translation
units”, in C-Standard-ese. In particular, by having a separate
“strings” section, there is no longer any need to mark particular
objects as “must be unique”. [C requires that &a != &b, even if
a and b are both const char arrays containing the same text.])
A third method is to put strings into the “text” (read-only,
code-only) section, and rely on the fact that code happens to be
readable as data on the system in question.
A fourth method is simply to allow string literals to be write-able.
In some cases, the object file format might allow for separate
read-only data and/or string sections, but the executable file
format might not. In this case, a compiler could move the rodata
back into either the text or the data (as desired).
Similarly, for OS-less systems, the final executable may be loaded
into some kind of ROM (PROM, EEPROM, flash memory, etc.). Typically
*all* text *and* data segments must be stored in some sort of
nonvolatile memory, with initialized-data copied to RAM by some
startup code. Here rodata can be left in the ROM, rather than
copied to (possibly precious) RAM (although RAM has gotten awfully
cheap — the days of shaving a few bucks off the price of a TRS-80
by leaving out one 21L02 chip are long gone…).
Note that if string literals and other rodata are in a ROM, and
the OS-less or tiny-OS system is run on a device without memory
protection, attempts to overwrite this data simply fail silently:
    char *p = "hello";
    strcpy(p, "world"); /* ERROR */
    printf("result: %s\n", p);
prints “result: hello”, because each attempt to overwrite the
contents of the ROM was completely ignored in hardware.
All that the C standard says is that attempts to overwrite string
literals produce undefined behavior. Actual behavior varies, but
tends to be one of these three: “segmentation fault - core dumped”
(or local equivalent), “attempt ignored”, or “literal overwritten”.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22’N, 111°50.29’W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.




