CSE 455 Computer Vision
CSE455: Computer Vision - Spring 2018
I saw this course on pjreddie's GitHub page and found it interesting.👍
It is an undergraduate course offered by the School of Computer Science and Engineering at the University of Washington. I did the assignments out of personal interest.😋
Solution of Assignments📁
My solutions include the code needed to finish the homework, plus the extra work for the extra credit.
Style Things📗
- ‘for’ loop initial declarations are only allowed in C99 mode
Although I could pass the -std=c99 flag to tell the compiler to use C99, I think it's cooler to declare the loop variables outside the loop:

    int i, j, k;
    for (i = 0; i < im.c; ++i){
        for (j = 0; j < im.h; ++j){
            for (k = 0; k < im.w; ++k){
                /*body*/
            }
        }
    }
- If statement
My obsession: if(expression) for single-line things, and if (expression){ for multiple lines. And always use if(1) or if(0) to enable/disable a code snippet.

    if(!sum) return;

    if (a == LOGISTIC){
        d.data[i][j] *= x * (1 - x);
    } else if (a == RELU){
        d.data[i][j] *= x > 0 ? 1 : 0;
    } else if (a == LRELU){
        d.data[i][j] *= x > 0 ? 1 : 0.1;
    }

    if(0){
        /*disabled body*/
    } else {
        /*enabled body*/
    }

So I can search for if(0) to locate the snippet and flip the switch quickly.
Actually, I did not stick to this norm in this repository.😂
- Always use ++i when I have a choice
Note📝
- Makefile
TODO: write a gist for it.
- Compile with OpenCV (using MinGW)
- struct with pointer inside it
When we define a struct with at least one pointer in it:

    typedef struct matrix{
        int rows, cols;
        double **data;
        int shallow;
    } matrix;
We should write a function that allocates and initializes its memory, for safety and convenience:

    matrix make_matrix(int rows, int cols)
    {
        matrix m;
        m.rows = rows;
        m.cols = cols;
        m.shallow = 0;
        m.data = calloc(m.rows, sizeof(double *));
        int i;
        for(i = 0; i < m.rows; ++i) m.data[i] = calloc(m.cols, sizeof(double));
        return m;
    }
And also a function to free the memory:
    void free_matrix(matrix m)
    {
        if (m.data) {
            int i;
            if (!m.shallow) for(i = 0; i < m.rows; ++i) free(m.data[i]);
            free(m.data);
        }
    }
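Remember to call it manually when a matrix is no longer needed. Otherwise the memory leaks, and a long-running program can eventually run out of memory and end in a ⚠️segmentation fault.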
And also a function for deep copy (if necessary):

    matrix copy_matrix(matrix m)
    {
        int i,j;
        matrix c = make_matrix(m.rows, m.cols);
        for(i = 0; i < m.rows; ++i){
            for(j = 0; j < m.cols; ++j){
                c.data[i][j] = m.data[i][j];
            }
        }
        return c;
    }
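The reason a deep copy is needed can be seen in a small sketch. It repeats the three functions above so that it compiles on its own, and adds a demo function, alias_vs_deep_copy, which is a hypothetical name and not part of the course code. Plain struct assignment copies only the pointer, so both variables share one buffer, while copy_matrix gets its own storage. As in the functions above, allocation failures are not handled beyond an assert.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct matrix{
    int rows, cols;
    double **data;
    int shallow;
} matrix;

matrix make_matrix(int rows, int cols)
{
    int i;
    matrix m;
    m.rows = rows;
    m.cols = cols;
    m.shallow = 0;
    m.data = calloc(m.rows, sizeof(double *));
    for(i = 0; i < m.rows; ++i) m.data[i] = calloc(m.cols, sizeof(double));
    return m;
}

void free_matrix(matrix m)
{
    if (m.data) {
        int i;
        if (!m.shallow) for(i = 0; i < m.rows; ++i) free(m.data[i]);
        free(m.data);
    }
}

matrix copy_matrix(matrix m)
{
    int i, j;
    matrix c = make_matrix(m.rows, m.cols);
    for(i = 0; i < m.rows; ++i){
        for(j = 0; j < m.cols; ++j){
            c.data[i][j] = m.data[i][j];
        }
    }
    return c;
}

/* Hypothetical demo: returns 1 when a write through `a` is visible through
 * the alias but not through the deep copy. */
int alias_vs_deep_copy(void)
{
    int ok;
    matrix a = make_matrix(2, 2);
    matrix alias = a;             /* copies the struct, shares a.data */
    matrix deep = copy_matrix(a); /* owns separate storage */
    assert(a.data && deep.data);

    a.data[0][0] = 42.0;
    ok = (alias.data[0][0] == 42.0) && (deep.data[0][0] == 0.0);

    free_matrix(deep);
    free_matrix(a); /* alias must NOT be freed too: that would be a double free */
    return ok;
}
```

The comment on the last free_matrix call is the practical takeaway: every buffer needs exactly one owner that frees it, which is presumably what the shallow field is meant to help track.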
- Never use a struct with a pointer inside it as an intermediate variable in an expression
In ./vision-hw4/src/classifier.c I used to write things like this:

    // THIS IS TOTALLY WRONG!
    matrix backward_layer(layer *l, matrix delta)
    {
        // back propagation through the activation
        gradient_matrix(l->out, l->activation, delta);

        // calculate dL/dw and save it in l->dw
        free_matrix(l->dw);
        matrix dw = matrix_mult_matrix(transpose_matrix(l->in), delta);
        l->dw = dw;

        // calculate dL/dx and return it.
        matrix dx = matrix_mult_matrix(delta, transpose_matrix(l->w));
        return dx;
    }
It is totally wrong because the intermediate struct values returned by transpose_matrix(l->in) and transpose_matrix(l->w) will never be freed. This convenient, Python-like style leaks memory on every call, fills your memory with intermediate garbage, and finally ends in a ⚠️segmentation fault.
The right way to do this is:

    matrix backward_layer(layer *l, matrix delta)
    {
        // back propagation through the activation
        gradient_matrix(l->out, l->activation, delta);

        // calculate dL/dw and save it in l->dw
        free_matrix(l->dw);
        matrix inT = transpose_matrix(l->in);
        matrix dw = matrix_mult_matrix(inT, delta);
        free_matrix(inT);
        l->dw = dw;

        // calculate dL/dx and return it.
        matrix wT = transpose_matrix(l->w);
        matrix dx = matrix_mult_matrix(delta, wT);
        free_matrix(wT);
        return dx;
    }
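One way to keep the convenient one-line call without leaking is to hide the temporary inside a helper. The sketch below is only an illustration: transpose_mult is a hypothetical name, and transpose_matrix and matrix_mult_matrix are minimal stand-ins written for this example, assumed to behave like the course functions of the same names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct matrix{
    int rows, cols;
    double **data;
    int shallow;
} matrix;

matrix make_matrix(int rows, int cols)
{
    int i;
    matrix m;
    m.rows = rows;
    m.cols = cols;
    m.shallow = 0;
    m.data = calloc(m.rows, sizeof(double *));
    for(i = 0; i < m.rows; ++i) m.data[i] = calloc(m.cols, sizeof(double));
    return m;
}

void free_matrix(matrix m)
{
    if (m.data) {
        int i;
        if (!m.shallow) for(i = 0; i < m.rows; ++i) free(m.data[i]);
        free(m.data);
    }
}

/* Stand-in for the course's transpose_matrix (assumed behavior). */
matrix transpose_matrix(matrix m)
{
    int i, j;
    matrix t = make_matrix(m.cols, m.rows);
    for(i = 0; i < m.rows; ++i){
        for(j = 0; j < m.cols; ++j){
            t.data[j][i] = m.data[i][j];
        }
    }
    return t;
}

/* Stand-in for the course's matrix_mult_matrix (assumed behavior). */
matrix matrix_mult_matrix(matrix a, matrix b)
{
    int i, j, k;
    matrix p;
    assert(a.cols == b.rows);
    p = make_matrix(a.rows, b.cols); /* calloc gives a zeroed accumulator */
    for(i = 0; i < a.rows; ++i){
        for(k = 0; k < a.cols; ++k){
            for(j = 0; j < b.cols; ++j){
                p.data[i][j] += a.data[i][k] * b.data[k][j];
            }
        }
    }
    return p;
}

/* Hypothetical helper: computes a^T * b and frees the transposed
 * temporary before returning, so the caller only owns the result. */
matrix transpose_mult(matrix a, matrix b)
{
    matrix aT = transpose_matrix(a);
    matrix p = matrix_mult_matrix(aT, b);
    free_matrix(aT);
    return p;
}
```

With such a helper, the dL/dw line could be written as a single call on l->in and delta, and the only matrix left to free is the returned one.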
- String handling can cause fatal mistakes
After finishing my code in ./vision-hw4, I trained the model on my Windows laptop and it worked well. But when I tried to do the same thing on Linux, training broke down and gave me 0% training and test accuracy.
After debugging, I found that I had accidentally changed the line endings of the file mnist.labels from LF to CRLF, which is the default on Windows.
This converts every \n (the line break on Linux) to \r\n (the line break on Windows).
So num0\n becomes num0\r\n in mnist.labels, and the same goes for the rest of the labels.
Now look at the char *fgetl(FILE *fp) function in ./src/data.c. It reads one line at a time from the text file, and that is how the labels for the training and test phases are parsed and stored.

    char *fgetl(FILE *fp)
    {
        if(feof(fp)) return 0;
        size_t size = 512;
        char *line = malloc(size*sizeof(char));
        if(!fgets(line, size, fp)){
            free(line);
            return 0;
        }

        size_t curr = strlen(line);

        while((line[curr-1] != '\n') && !feof(fp)){
            if(curr == size-1){
                size *= 2;
                line = realloc(line, size*sizeof(char));
                if(!line) {
                    fprintf(stderr, "malloc failed %ld\n", size);
                    exit(0);
                }
            }
            size_t readsize = size-curr;
            if(readsize > INT_MAX) readsize = INT_MAX-1;
            fgets(&line[curr], readsize, fp);
            curr = strlen(line);
        }
        if(line[curr-1] == '\n') line[curr-1] = '\0';

        return line;
    }
Most importantly, this function looks only for \n as the end-of-line marker and strips nothing else. So the label num0 becomes num0\r, and the same goes for the other labels.
With a stray \r on every label, all the training samples are treated as negatives, and so are the test samples. Surprisingly but reasonably, I got 0% for both training and test accuracy.
Remember:
- LF as the default line ending
- Make string functions compatible with both Linux and Windows
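As a minimal sketch of the second point, a line reader can strip both \n and \r from the end of each line, so that LF and CRLF files produce the same label. The helper name strip_line_ending is hypothetical and not part of the course code; it could be called on the string that fgetl returns, or replace the final \n check inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the course code): removes every
 * trailing '\n' and '\r' from line in place and returns the new length. */
size_t strip_line_ending(char *line)
{
    size_t len;
    assert(line != NULL);
    len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r')) {
        line[--len] = '\0';
    }
    return len;
}
```

Checking len > 0 before indexing also keeps the helper safe on an empty string, a case where reading line[len - 1] directly would go out of bounds.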
- More extra credit for vision-hw2 (spherical coordinates)
Resources📚
- Textbook: Computer Vision: Algorithms and Applications, Rick Szeliski, 2010.
- My solution: ivanpp/CSE455_Spring_2018