WEEKLY PROGRESS REPORT (WPR) For the week commencing: 12-05-2014
WPR: 1
Enrollment Number: A2305212094
Program: BTech CSE
Student Name: Aneesh Devasthale
Faculty Guide's Name: Mr. Deepak Gaur
Project Title: Image Processing on the GPU

TARGETS SET FOR THE WEEK
1. Download and install the CUDA toolkit. Configure it to run with Visual Studio.
2. Find out more about the CUDA toolkit, its associated terminology and commonly used functions.
3. Run sample test code on the GPU.
4. Define a suitable data structure to store an image.
5. Develop functions for reading, writing and displaying an image.

ACHIEVEMENTS FOR THE WEEK
1. Downloaded and installed NVIDIA CUDA Toolkit 6.0. I already had Visual Studio Professional 2013 installed. However, the CUDA toolkit doesn't support the VS2013 C++ compiler (VC12), so I installed Visual Studio Express 2012 for Desktop, which has the VC11 C++ compiler.
   Steps for configuring Visual Studio:
   - Add CUDA 6.0 to Build Customizations.
   - Change the platform toolset to Visual Studio 2012 (v110).
   - Reference the CUDA include directories and CUDA additional directories.
   - Add cudart.lib to the Additional Dependencies of the Linker.

2. Completed parts of an online course on GPU computing on Udacity.com (https://www.udacity.com/course/viewer#!/c-cs344/l-55120467/m-65830481).
   Understood the difference between host and device: anything that runs on the CPU runs on the host, and anything that runs on the GPU runs on the device. The C++ code is compiled with the NVCC compiler, which generates both host object code (using VC++) and device object code.
   A typical CUDA source file (.cu file) contains a special function called the kernel. The kernel specifies the task to be done by each thread. It is invoked from the host code, along with the number of threads to start.
   The real power of the GPU lies in the huge number of threads it can run. For example, the NVIDIA GeForce GT 650M in my machine has the following properties:
   Maximum number of threads per multiprocessor: 2048
   Maximum number of threads per block: 1024
   Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
   Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)

3. I was able to build, run and debug the samples provided with the toolkit successfully.

4. To store an image, I first decided to use the simplest data structure, a 2-dimensional array of integers. However, I have used a 1-dimensional array instead, with the data stored in row-major form. This is because a dynamically allocated 2-D array (an array of row pointers) cannot be passed to a CUDA kernel directly, whereas a flat array can be copied to the device in one piece.
   The file type corresponding to this data type is the PGM format. PGM is an acronym for Portable Grey Map. It is a standard bitmap-based format consisting of a 4-line header and data stored in the unsigned char type, providing a maximum of 256 grey scale levels, or 8 bits of data per pixel. [4]
   It is one of the simplest image formats: a value of 0-255 is given for each pixel in the image, along with a few header fields. This project uses the PGM format for input and output images.
   A sample PGM file (in the plain-text P2 variant; the binary P5 variant stores the same pixel values as raw bytes):

    P2
    # Shows the word "FEEP" (example from Netpbm main page on PGM)
    24 7
    15
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    0 3 3 3 3 0 0 7 7 7 7 0 0 11 11 11 11 0 0 15 15 15 15 0
    0 3 0 0 0 0 0 7 0 0 0 0 0 11 0 0 0 0 0 15 0 0 15 0
    0 3 3 3 0 0 0 7 7 7 0 0 0 11 11 11 0 0 0 15 15 15 15 0
    0 3 0 0 0 0 0 7 0 0 0 0 0 11 0 0 0 0 0 15 0 0 0 0
    0 3 0 0 0 0 0 7 7 7 7 0 0 11 11 11 11 0 0 15 0 0 0 0
    0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

[Figure: the sample image, magnified]

Due to the simplicity of this format, image processing algorithms can be easily demonstrated by adjusting the value of each pixel. For example, to increase the brightness of this image by 1 unit, the value of each pixel is simply incremented by 1. Similarly, to darken the image, the value of each pixel is decremented by 1. However, care must be taken to keep the value of each pixel between 0 and the maximum grey level, which is given in the header line that follows the image dimensions (it is 15 in the example file).
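
The brightness adjustment described above can be sketched in C++ as follows. This is only an illustrative sketch: the function name adjustBrightness and the use of std::vector<int> for the pixel data are assumptions made for this example and are not part of the project code. Each pixel is shifted by delta and then clamped so that it stays a valid grey level.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not from the project code): shifts every pixel by
// `delta` and clamps the result to the valid range [0, maxGray].
std::vector<int> adjustBrightness(const std::vector<int>& pixels, int delta, int maxGray)
{
    assert(maxGray >= 0);
    std::vector<int> result;
    result.reserve(pixels.size());
    for (int value : pixels)
    {
        int shifted = value + delta;
        // Clamp so the pixel never leaves the range allowed by the header.
        result.push_back(std::max(0, std::min(shifted, maxGray)));
    }
    return result;
}
```

For the example file (maximum grey level 15), adjustBrightness({0, 3, 15}, 1, 15) gives {1, 4, 15}: the pixel that is already at 15 stays at 15 instead of exceeding the maximum.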

Image data structure

I encapsulated the array in an Image struct along with some other required values.

    struct Image
    {
        int N;          // number of rows
        int M;          // number of columns
        int Q;          // maximum gray value
        int *pixelVal;  // pixel values as a 1-D array in row-major order
    };
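
Since the pixels are kept in a flat array, the pixel at row i and column j is found at index i*M + j. The sketch below illustrates this mapping. The helper names pixelIndex and pixelAt are hypothetical and used only for illustration, and the struct is repeated so that the example compiles on its own.

```cpp
#include <cassert>

// Same layout as the Image struct above, repeated so the sketch is self-contained.
struct Image
{
    int N;          // number of rows
    int M;          // number of columns
    int Q;          // maximum gray value
    int *pixelVal;  // N*M pixel values in row-major order
};

// Hypothetical helper: flat index of the pixel at row i, column j.
int pixelIndex(const Image& image, int i, int j)
{
    assert(i >= 0 && i < image.N && j >= 0 && j < image.M);
    return i * image.M + j;
}

// Hypothetical helper: reference to the pixel at row i, column j.
int& pixelAt(Image& image, int i, int j)
{
    return image.pixelVal[pixelIndex(image, i, j)];
}
```

For the 7 x 24 sample image, the pixel at row 1, column 13 is stored at index 1*24 + 13 = 37, which holds the value 11.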

5. Developed simple functions in C++ to read and write an image.

Code to read a PGM file

    #include <cstdlib>   // exit, strtol, atoi
    #include <fstream>   // std::ifstream, std::ofstream
    #include <iostream>  // std::cout

    // Reads a binary (P5) PGM file into image. Allocates image.pixelVal;
    // the caller must release it with delete[].
    int readImage (const char fname[], Image& image)
    {
        int i, j;
        int N, M, Q;
        unsigned char *charImage;
        char header [100], *ptr;

        std::ifstream ifp;
        ifp.open (fname, std::ios::in | std::ios::binary);

        if ( !ifp ) // error checking
        {
            std::cout << "Can't read image: " << fname << std::endl;
            exit (1);
        }

        // read header

        ifp.getline (header, 100, '\n'); // magic number
        if ( ( header [0] != 'P' ) || ( header [1] != '5' ) ) // if not P5
        {
            std::cout << "Image " << fname << " is not PGM" << std::endl;
            exit (1);
        }

        ifp.getline (header, 100, '\n');
        while ( header [0] == '#' ) // skip comment lines, which start with #
            ifp.getline (header, 100, '\n');

        M = strtol (header, &ptr, 0); // number of columns
        N = atoi (ptr);               // number of rows

        ifp.getline (header, 100, '\n');
        Q = strtol (header, &ptr, 0); // max gray value

        // store the dimensions and allocate the row-major pixel array
        image.N = N;
        image.M = M;
        image.Q = Q;
        image.pixelVal = new int [M*N];

        charImage = new unsigned char [M*N]; // temporary buffer for the raw bytes

        ifp.read (reinterpret_cast<char *>( charImage ), ( M*N )*sizeof (unsigned char)); // reads the pixel data

        if ( ifp.fail () )
        {
            std::cout << "Image " << fname << " has wrong size" << std::endl;
            exit (1);
        }

        ifp.close ();

        // Convert the unsigned characters to integers

        int val;

        for ( i = 0; i<N; i++ )
            for ( j = 0; j<M; j++ )
            {
                val = (int)charImage [i*M + j];
                image.pixelVal [i*M + j] = val;
            }
        std::cout << "\nImage read successfully: " << fname << "\n";
        delete[] charImage;

        return ( 1 );
    }

Code to write to a PGM file

    // Writes image to a binary (P5) PGM file.
    int writeImage (const char fname[], Image& image)
    {
        int i, j;
        int N, M, Q;
        unsigned char *charImage;
        std::ofstream ofp;

        N = image.N;
        M = image.M;
        Q = image.Q;

        charImage = new unsigned char [M*N]; // temporary buffer for the raw bytes

        // convert the integer values to unsigned char

        int val;

        for ( i = 0; i<N; i++ )
        {
            for ( j = 0; j<M; j++ )
            {
                val = image.pixelVal [i*M + j];
                charImage [i*M + j] = (unsigned char)val;
            }
        }

        ofp.open (fname, std::ios::out | std::ios::binary);

        if ( !ofp )
        {
            std::cout << "Can't open file: " << fname << std::endl;
            exit (1);
        }

        // write header: magic number, columns and rows, max gray value
        ofp << "P5" << std::endl;
        ofp << M << " " << N << std::endl;
        ofp << Q << std::endl;

        ofp.write (reinterpret_cast<char *>( charImage ), ( M*N )*sizeof (unsigned char));

        if ( ofp.fail () )
        {
            std::cout << "Can't write image " << fname << std::endl;
            exit (1); // non-zero exit code, since this is a failure
        }

        ofp.close ();

        delete[] charImage;
        std::cout << "\nOutput image written successfully: " << fname << std::endl;
        return ( 1 );
    }

Code sourced from the public domain. It can be found here: http://www.dreamincode.net/forums/topic/76816-image-processing-tutorial/
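
The header handling in readImage assumes a fixed line layout (magic number, optional comment lines, dimensions, maximum gray value). As a possible alternative, the sketch below reads the header token by token and skips comment lines wherever they appear between fields. The names PgmHeader, skipComments and readPgmHeader are hypothetical and not part of the sourced code. The sketch only parses the header and leaves the stream positioned at the start of the pixel data.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>  // std::istringstream, handy for trying the parser on in-memory text
#include <string>

// Hypothetical container for the PGM header fields.
struct PgmHeader
{
    std::string magic;  // "P2" (plain text) or "P5" (binary)
    int width;          // number of columns
    int height;         // number of rows
    int maxGray;        // maximum gray value
};

// Skips whitespace and any '#' comment lines before the next header token.
void skipComments(std::istream& in)
{
    while ((in >> std::ws) && in.peek() == '#')
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
}

// Reads the header; returns false if it is malformed or not a PGM file.
bool readPgmHeader(std::istream& in, PgmHeader& header)
{
    in >> header.magic;
    if (header.magic != "P2" && header.magic != "P5")
        return false;
    skipComments(in);
    in >> header.width;
    skipComments(in);
    in >> header.height;
    skipComments(in);
    in >> header.maxGray;
    if (!in || header.width <= 0 || header.height <= 0 || header.maxGray <= 0)
        return false;
    in.get();  // consume the single whitespace character that ends the header
    return true;
}
```

Applied to the sample file shown earlier (for example through a std::istringstream), this yields the magic number P2, a width of 24, a height of 7 and a maximum gray value of 15.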

FUTURE WORK PLANS
1. Implement basic image processing algorithms like colour inversion (negative) and slightly more complex algorithms like blur to run on the CPU.
2. Attempt to implement the same algorithms to run on the GPU.