Predicting Students' Dropout and Academic Success via Classification
I used KNN, SVM, SGDClassifier, and Random Forest to classify student data and predict who will graduate, drop out, or stay enrolled. The project report is available at https://github.com/betulmesci/classification-of-students/blob/main/Classification_final.pdf
Customer Segmentation using KMeans
In this project, I utilized customer data from a store and categorized customers based on their personal characteristics and shopping behaviors. At the conclusion of the project, statistical and verbal descriptions were provided for each cluster.
Image Denoising using TOMOGAN
In this project, we used GANs (specifically TOMOGAN) to denoise three noisy images. The results were much better than those obtained with CNNs.
Image Classification using Convolutional Neural Networks
In this project, digits from house numbers in Google Maps images were classified into ten categories (0 to 9) using Convolutional Neural Networks. The accuracy on the test set was 84%.
Cluster Validity
In this project, I discussed the Partition Coefficient, Classification Entropy, and CS index as cluster validity measures.
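As a rough illustration of two of these indices, here is a minimal sketch of the Partition Coefficient and Classification Entropy for a fuzzy membership matrix. The function names and the small matrix U are hypothetical and not taken from the project. The CS index is left out because it also needs the data points and cluster centers.

```python
import math


def partition_coefficient(memberships):
    """Mean of squared memberships: 1.0 for a crisp partition, 1/c at the fuzziest."""
    n = len(memberships)
    return sum(u * u for row in memberships for u in row) / n


def classification_entropy(memberships):
    """Mean entropy of the membership rows: 0.0 for a crisp partition, ln(c) at the fuzziest."""
    n = len(memberships)
    return -sum(u * math.log(u) for row in memberships for u in row if u > 0) / n


# Hypothetical membership matrix: one row per data point, one column per cluster.
U = [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5]]
pc = partition_coefficient(U)
ce = classification_entropy(U)
```

A higher Partition Coefficient and a lower Classification Entropy both point to a less fuzzy, better separated partition.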
Classifying Emails into Correct Newsgroups
In this project, around 18,000 emails were classified into newsgroups based on their content using several supervised machine learning techniques. Logistic Regression gave the best accuracy, at 89%.
Image Classification with KNeighbors and Support Vector Machine Algorithms
In this project, I used the K Nearest Neighbors (KNN) and Support Vector Machine (SVM) algorithms to classify house numbers from a dataset of Google Street View photos. After extensive image preprocessing, both models did fairly well: KNN reached an accuracy of 65% and SVM 66%.
Image Processing Tutorial Using scikit-image - Contour Detection
In image processing, finding the contours of objects can help with object detection. As an exercise, I used scikit-image to detect the contours of the holes in a slice of cheese and count them.
Image Processing Tutorial Using scikit-image - Thresholding
In this section of the tutorial, I explored various thresholding techniques. I measured and compared the bacterial growth in several petri dishes using masking and thresholding techniques as an exercise.
Image Processing Tutorial Using scikit-image - Noise
In this part, I discussed Gaussian (random) noise and how to add different kinds of noise to images and remove them.
Image Processing Tutorial Using scikit-image - Masking
In this section, I went over masking images to show only the relevant parts. As an exercise, I used a for loop to show hamburgers in a repetitive image and black out the rest.
An Analysis of DUIs Issued by Police in Ames, Iowa
Ames, Iowa is a college town. Using a dataset of DUIs issued there, I explored who received most of the tickets and whether Ames being a college town was associated with an increase in the number of tickets issued.
Market Basket Analysis
Using manual computations and the Apriori algorithm, I calculated support, confidence, and lift over more than 99,000 transactions. I then identified the ten best association rules using these metrics.
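The three metrics can be computed by hand in a few lines. The sketch below uses a tiny made-up transaction list, not the project's data, and hypothetical function names. It shows only the manual computation, not the Apriori candidate generation.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, set(antecedent) | set(consequent))
    confidence = s_ac / s_a   # estimate of P(consequent | antecedent)
    lift = confidence / s_c   # values above 1 mean the items co-occur more than chance
    return s_ac, confidence, lift


# Made-up transactions, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
sup, conf, lift = rule_metrics(transactions, {"bread"}, {"milk"})
```

Ranking candidate rules by lift, with confidence or support as tie-breakers, is one plausible way to arrive at a "ten best" list.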
Amazon Product Recommender Based on Customer Reviews and Product Descriptions
In this project, I developed two recommendation systems for Amazon products based on customer reviews and product descriptions, using cosine similarity and word embeddings.
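The core of a cosine-similarity recommender fits in a short sketch. As a simplifying assumption, the version below uses plain word counts instead of the word embeddings mentioned above, and the product names and descriptions are invented for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query, descriptions, top_n=2):
    """Rank products by similarity between the query text and each description."""
    q = Counter(query.lower().split())
    scored = [
        (cosine_similarity(q, Counter(text.lower().split())), name)
        for name, text in descriptions.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)[:top_n]]


# Invented product descriptions, for illustration only.
products = {
    "kettle": "electric kettle for boiling water",
    "mug": "ceramic coffee mug",
    "teapot": "glass teapot for boiling water tea",
}
top = recommend("boiling water kettle", products)
```

Swapping the word counts for averaged embedding vectors would leave the ranking logic unchanged. Only the vector construction differs.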
Image denoising using Convolutional Neural Networks
In this project, three noisy images were denoised using Convolutional Neural Networks.
Edge Detection
In this project, the Roberts, Sobel and Prewitt filters were implemented from scratch and applied to images to detect the edges of objects.
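A from-scratch edge filter boils down to sliding a small kernel over the image. The sketch below shows the idea for Sobel using NumPy and a synthetic image. It is an illustration under those assumptions, not the project's code, and the function names are hypothetical.

```python
import numpy as np

# Standard 3x3 Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T


def filter2d(image, kernel):
    """Slide the kernel over the image (cross-correlation, 'valid' region only)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def sobel_magnitude(image):
    """Gradient magnitude computed from the two Sobel responses."""
    gx = filter2d(image, SOBEL_X)
    gy = filter2d(image, SOBEL_Y)
    return np.hypot(gx, gy)


# Synthetic image: dark left half, bright right half, so there is one vertical edge.
image = np.zeros((5, 6))
image[:, 3:] = 1.0
edges = sobel_magnitude(image)
```

Prewitt and Roberts follow the same pattern with different kernels, so only the two kernel arrays would need to change.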
Fuzzy C Means Clustering
In this project, I used Fuzzy C Means to determine the clusters in several datasets that are in image form.
Investment Analysis
In this project, data for two cab companies were provided. I carried out an analysis to determine which cab company would be the more profitable investment.
Image Processing Tutorial Using scikit-image - Image Restoration
In this section of the tutorial, I used the inpaint_biharmonic function from scikit-image to restore the torn-out parts of a severely damaged photo of a person.
Image Processing Tutorial Using scikit-image - Contrast Enhancement
In this section, I covered standard and adaptive histogram equalization for bringing out detail in low-contrast images. As an exercise, I used these techniques to enhance a COVID-19 patient's lung X-ray image.
Image Processing Tutorial Using scikit-image - Image Segmentation
In this section, I covered two kinds of image segmentation techniques: supervised and unsupervised. For the supervised methods, I explored Active Contour and Random Walker segmentation to detect a person's face. For the unsupervised methods, I discussed Simple Linear Iterative Clustering (SLIC) and Felzenszwalb clustering.
Image Processing Tutorial Using scikit-image - Histograms
In this section, I explored creating histograms of grayscale and color images with NumPy and pyplot. I discussed a drawback of histograms and how Kernel Density Estimation (KDE) can overcome it.
Image Processing Tutorial Using scikit-image - Basic Operations on Images
This tutorial uses the scikit-image library to process images. This introductory section presents basic operations such as loading and transforming images.
How does a Movie's Rating Affect its Revenue?
In this project, I explored whether there is a correlation between a movie's rating and the revenue it brings in by applying Linear Regression to a dataset provided by the Disney Company.
Predicting Stock Prices of Home Depot Based on Trends and the Sentiment of News and Tweets
In this project, I first applied Linear Regression and CNN+LSTM to Home Depot's historical stock price data. I then gathered tweets and Google News results related to Home Depot, ran sentiment analysis on them, and added the number of Google searches for Home Depot (trends). I applied the same two models to the combined historical + sentiment + trends data. Finally, I applied K-Means clustering, added the cluster information to the data, and ran the same analysis again. The models, especially CNN+LSTM, predicted the target variable almost perfectly.
Image Segmentation using Several Methods
In this project, image segmentation using K-Means Clustering, Contour Detection, Thresholding and Color Masking was discussed.
Sampling and Quantization of Images
Sampling and Quantization of Images are discussed in detail.
Inspecting Clustering Tendency of a Dataset using Visual Assessment of Tendency
In this project, I discussed how Visual Assessment of Tendency (VAT) can be used to determine the clustering tendency of several datasets that are in image form.
Dimensionality Reduction using PCA and LDA
In this project, Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are discussed as dimensionality reduction techniques and implemented from scratch on the Iris dataset. For PCA, the first three principal components (PCA1, PCA2 and PCA3) are extracted and the features are projected onto 1D, 2D and 3D spaces using one, two and three of these vectors, respectively. LDA is implemented with only the first component, and the features are projected onto 1D using this vector. Conclusion: LDA is much more successful than PCA at linearly separating the classes, because it is a supervised technique that uses the class label of each data point, whereas PCA is unsupervised and has no access to the labels. Still, both can project data points onto new dimensions.
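For readers who want to see what a from-scratch PCA projection involves, here is a minimal NumPy sketch. It runs on synthetic four-feature data standing in for Iris, and the function name pca_project is hypothetical. It illustrates the general technique and is not the project's implementation.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its top principal components via the covariance eigenvectors."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)      # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]           # sort by descending variance
    components = eigvecs[:, order[:n_components]]
    return X_centered @ components, eigvals[order]


# Synthetic stand-in for a 150 x 4 dataset such as Iris (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 4)) * np.array([3.0, 1.0, 0.5, 0.1])
projected, variances = pca_project(X, 2)
```

Passing 1, 2 or 3 as n_components gives the 1D, 2D and 3D projections. LDA differs in that it builds its projection from within-class and between-class scatter matrices, which requires the labels.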
Image Processing Tutorial Using scikit-image - Edge & Corner Detection
In this part, I talked about edge and corner detection of the objects. I used Sobel, Prewitt, Scharr and Canny filters to detect edges. For corners, I used corner_harris() from skimage.feature to detect and count the corners of the objects in an image.
Image Processing Tutorial Using scikit-image - Connected Component Analysis
In this section, I covered counting and labeling objects using the Connected Component Analysis algorithm. As an exercise, I counted the objects in an image and removed those with areas below a given threshold.
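To make the idea concrete, the sketch below labels 4-connected regions in a small binary mask with a breadth-first search and then filters them by area. It is a pure-Python stand-in for illustration: the tutorial itself uses scikit-image (where skimage.measure.label performs the labeling), and the mask, function name, and min_area value here are made up.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions; return (label grid, {label: area})."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    sizes = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                # Unvisited foreground pixel: start a new component and flood it.
                next_label += 1
                labels[r][c] = next_label
                queue = deque([(r, c)])
                size = 0
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
                sizes[next_label] = size
    return labels, sizes


# Made-up binary mask: one 2x2 blob and two isolated pixels.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
labels, sizes = label_components(mask)
min_area = 2  # hypothetical area threshold
kept = {label for label, size in sizes.items() if size >= min_area}
```

Here len(sizes) is the object count before filtering and len(kept) is the count after small objects are dropped.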
Image Processing Tutorial Using scikit-image - Morphological Operations
In this part, I went over erosion, dilation, opening and closing functions. As an exercise, I used an image of touching coins and had skimage count the number of coins automatically.
Image Processing Tutorial Using scikit-image - Manipulating Image Pixels
In this section, I explored how to set pixel values to a certain value based on a threshold value and do calculations on pixel values. As an exercise, I calculated average red, green and blue color values of certain pixels which may be useful in determining color change over time in a titration experiment.
Web Scraping with BeautifulSoup
In this project, I scraped a children's story website with the BeautifulSoup library. The goal was to capture the story titles and story descriptions, clean them, and present them in a DataFrame.