Computing made easy with GPU

GPU COMPUTING
The CPU (Central Processing Unit) is often described as the brain of the computer: it performs the general computational and logical operations of the system. CPUs perform well on general-purpose problems and algorithms, but they are slower on workloads dominated by large volumes of floating-point arithmetic. Such workloads arise mostly when processing graphical images, video and other media files. GPUs (Graphics Processing Units) were added to general-purpose computer systems to assist the CPU with these calculations. They have since become an integral part of such systems, and over the past few years their performance has increased considerably. In present-day computer systems these capabilities are used to render high-definition video and to deliver a high-quality gaming experience. The GPU's combination of computational power and programmability has generated interest in using it for other computational purposes as well, to reduce the workload of the CPU. These efforts to bring the GPU into general-purpose computing are known as GPU computing.
GPUs are designed for a particular class of applications: those with high computational requirements, substantial parallelism, and a need for throughput more than for low latency. Architecturally, a GPU contains many more cores than a CPU, so it can run many more threads concurrently and execute more instructions in a given time. Even though GPU computing has many advantages, several problems arise when adapting the GPU for general-purpose computing. One of them is handling the GPU's pipelined tasks. The parallelism of the GPU's processing cores is designed to increase throughput rather than to reduce latency, because throughput is what graphics APIs require.
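The throughput-versus-latency trade-off described above can be illustrated with a toy timing model. This is only a sketch under stated assumptions, not a measurement: the core counts and per-item times are hypothetical numbers chosen for illustration, and `batch_time` is a made-up helper, not part of any GPU API.

```python
import math


def batch_time(n_items: int, n_cores: int, time_per_item: float) -> float:
    """Time to process n_items when each core handles one item at a time.

    Items are processed in waves of n_cores, so the total time is the
    number of waves multiplied by the per-item time.
    """
    waves = math.ceil(n_items / n_cores)
    return waves * time_per_item


# Hypothetical devices (assumed numbers, for illustration only).
N_ITEMS = 100_000
few_fast = batch_time(N_ITEMS, n_cores=4, time_per_item=1.0)        # CPU-like
many_slow = batch_time(N_ITEMS, n_cores=1_000, time_per_item=10.0)  # GPU-like

print(f"few fast cores : {few_fast:.0f} time units for the whole batch")
print(f"many slow cores: {many_slow:.0f} time units for the whole batch")

# A single item still finishes sooner on the fast core (lower latency).
print(f"one item, fast core: {batch_time(1, 4, 1.0):.0f} time unit(s)")
print(f"one item, slow core: {batch_time(1, 1_000, 10.0):.0f} time unit(s)")
```

In this model the device with many slow cores finishes the whole batch sooner even though every individual item takes longer on it, which is the sense in which throughput matters more than latency for such workloads.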
To execute a pipeline, a CPU would take a single element (or group of elements) and process the first stage in the pipeline, then the next stage, and so on. The CPU divides the pipeline in time, applying all resources in the processor to each stage in turn. GPUs take a different approach: the GPU divides the resources of the processor among the different stages, so that the pipeline is divided in space, not time. The part of the processor working on one stage feeds its output directly into a different part that works on the next stage. This pipelined architecture has two main advantages. First, the hardware in any given stage can exploit data parallelism within that stage, processing multiple elements at the same time, which enables the GPU to meet large computational needs. Second, each stage's hardware can be customized with special-purpose hardware for its given task, allowing substantially greater computational and area efficiency than a general-purpose solution. The major disadvantage of this approach is load balancing. As in any other pipeline, the performance of the GPU pipeline depends on its slowest stage. If the vertex program is complex and the fragment program is simple, the effective throughput of the GPU will be determined by the performance of the vertex program.
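The load-balancing problem can be made concrete with a small model of a pipeline divided in space. This is a simplified sketch, assuming each stage works on one element at a time and that stage times are constant; the stage times below are hypothetical, and `pipeline_time` and `steady_state_throughput` are illustrative helpers rather than part of any graphics API.

```python
def pipeline_time(stage_times, n_elements):
    """Total time for n_elements to pass through a space-divided pipeline.

    The first element needs sum(stage_times) to traverse every stage.
    After that, stages overlap and one further element completes every
    max(stage_times) time units, so the slowest stage sets the pace.
    """
    if n_elements == 0:
        return 0.0
    return sum(stage_times) + (n_elements - 1) * max(stage_times)


def steady_state_throughput(stage_times):
    """Elements completed per time unit once the pipeline is full."""
    return 1.0 / max(stage_times)


# Two hypothetical workloads with the same total work per element (6 units).
unbalanced = {"vertex": 5.0, "fragment": 1.0}  # complex vertex, simple fragment
balanced = {"vertex": 3.0, "fragment": 3.0}

for name, stages in (("unbalanced", unbalanced), ("balanced", balanced)):
    times = list(stages.values())
    total = pipeline_time(times, 1000)
    rate = steady_state_throughput(times)
    print(f"{name:10s}: {total:.0f} time units, {rate:.3f} elements per unit")
```

Both workloads do the same amount of work per element, yet the unbalanced one is slower in this model because the fragment stage sits idle while it waits for the vertex stage.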