Facial expressions play a significant role in conveying a person's emotions. Because of its applicability to a wide range of applications, such as human-computer interaction and driver status monitoring, Facial Expression Recognition (FER) has received substantial attention from researchers. Earlier studies typically use only a small feature set to extract facial features for an FER system, and to date no systematic comparison of facial features exists. Therefore, in the current research, we identified 18 different facial features (cardinality of 46,352) by reviewing 25 studies and implemented them on the publicly available Extended Cohn-Kanade (CK+) dataset. After extracting the facial features, we performed Feature Selection (FS) using Joint Mutual Information (JMI), Conditional Mutual Information Maximization (CMIM) and Max-Relevance Min-Redundancy (MRMR), and systematically compared the three methods. For classification, we applied various machine learning techniques. The Bag of Visual Words (BoVW) model approach results in significantly higher classification accuracy than the formal approach. We also found that the optimal classification accuracy for FER can be obtained by using only 20% of the total identified features. Grey comatrix (grey-level co-occurrence matrix) and Haralick features were explored for the first time for FER, and grey comatrix features outperformed several of the most commonly used features, including Local Binary Pattern (LBP) and Active Appearance Model (AAM). Histogram of Oriented Gradients (HOG) turns out to be the most significant feature for FER, followed by Local Directional Positional Pattern (LDSP) and grey comatrix.
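
To make the feature-selection step concrete, the following is a minimal sketch of a greedy Max-Relevance Min-Redundancy selection over discrete features. It is an illustration under stated assumptions, not the implementation used in this study: the function names (`mutual_information`, `mrmr_select`), the difference form of the criterion (relevance minus mean redundancy), and the toy data are all hypothetical, and the toy data are not drawn from CK+.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedily pick k feature names.

    features: dict mapping a feature name to a list of discrete values.
    Each step picks the feature maximizing
    I(feature; labels) - mean over selected s of I(feature; s).
    Candidates are scanned in sorted name order, so ties go to the
    alphabetically first name.
    """
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected = []
    remaining = set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): four classes, two samples each.
labels = [0, 0, 1, 1, 2, 2, 3, 3]
features = {
    "high_bit": [y // 2 for y in labels],       # informative
    "high_bit_copy": [y // 2 for y in labels],  # informative but redundant
    "low_bit": [y % 2 for y in labels],         # informative, complementary
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],          # independent of the labels
}

selected = mrmr_select(features, labels, k=2)
print(selected)
```

In this toy setting the redundancy term steers the second pick away from the duplicated feature and toward the complementary one, which is the behaviour that lets a small subset of features retain most of the discriminative information. JMI and CMIM follow the same greedy pattern with different scoring functions.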