Monday, January 13, 2014

Garbage collection (computer science)

In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, or memory occupied by objects that are no longer in use by the program. Garbage collection was invented by John McCarthy around 1959 to solve problems in Lisp.

Garbage collection is often portrayed as the opposite of manual memory management, which requires the programmer to specify which objects to deallocate and return to the memory system. However, many systems use a combination of approaches, including other techniques such as stack allocation and region inference.

Garbage collection does not traditionally manage limited resources other than memory that typical programs use, such as network sockets, database handles, user interaction windows, and file and device descriptors. Methods used to manage such resources, particularly destructors, may suffice as well to manage memory, leaving no need for GC. Some GC systems allow such other resources to be associated with a region of memory that, when collected, causes the other resource to be reclaimed; this is called finalization.

Finalization may introduce complications limiting its usability, such as intolerable latency between disuse and reclaim of especially limited resources, or a lack of control over which thread performs the work of reclaiming.

Principles


The basic principles of garbage collection are:


  •     Find data objects in a program that cannot be accessed in the future
  •     Reclaim the resources used by those objects

Many computer languages require garbage collection, either as part of the language specification (e.g., Java, C#, and most scripting languages) or effectively for practical implementation (e.g., formal languages like lambda calculus); these are said to be garbage collected languages. Other languages were designed for use with manual memory management, but have garbage collected implementations available (e.g., C, C++). Some languages, like Ada, Modula-3, and C++/CLI allow both garbage collection and manual memory management to co-exist in the same application by using separate heaps for collected and manually managed objects; others, like D, are garbage collected but allow the user to manually delete objects and also entirely disable garbage collection when speed is required. While integrating garbage collection into the language's compiler and runtime system enables a much wider choice of methods, post hoc GC systems exist, including some that do not require recompilation. (Post-hoc GC is sometimes distinguished as litter collection.) The garbage collector will almost always be closely integrated with the memory allocator.

Benefits


Garbage collection frees the programmer from manually dealing with memory deallocation. As a result, certain categories of bugs are eliminated or substantially reduced:
  •     Dangling pointer bugs, which occur when a piece of memory is freed while there are still pointers to it, and one of those pointers is dereferenced. By then the memory may have been re-assigned to another use, with unpredictable results.
  •     Double free bugs, which occur when the program tries to free a region of memory that has already been freed, and perhaps already been allocated again.
  •     Certain kinds of memory leaks, in which a program fails to free memory occupied by objects that have actually become unreachable, which leads to memory exhaustion if such a behavior is repeated indefinitely. (Garbage collection typically does not deal with the unbounded accumulation of data which is reachable, but which will actually not be used by the program.)

Some of the bugs addressed by garbage collection can have security implications.

Disadvantages


Typically, garbage collection has certain disadvantages:
  •     Garbage collection consumes computing resources in deciding which memory to free, reconstructing facts that may have been known to the programmer. The penalty for the convenience of not annotating object lifetime manually in the source code is overhead, often[citation needed] leading to decreased or uneven performance. Interaction with memory hierarchy effects can make this overhead intolerable in circumstances that are hard to predict or to detect in routine testing.
  •     The moment when the garbage is actually collected can be unpredictable, resulting in stalls scattered throughout a session. Unpredictable stalls can be unacceptable in real-time environments, in transaction processing, or in interactive programs. Incremental, concurrent, and real-time garbage collectors address these problems, with varying trade-offs.
  •     Non-deterministic GC is incompatible with RAII based management of non GCed resources. As a result, the need for explicit manual resource management (release/close) for non-GCed resources becomes transitive to composition. That is: in a non-deterministic GC system, if a resource or a resource like object requires manual resource management (release/close), and this object is used as 'part of' another object, then the composed object will also become a resource like object that itself requires manual resource management (release/close).

No comments:

Post a Comment