Cluster analysis of North-West Pacific Basin tropical cyclone genesis points
Cluster analysis simplifies inference from massive datasets, which makes it instrumental in modern data analysis. In this work, we used the k-means algorithm to cluster the genesis points of North-West Pacific Basin (NWPB) tropical cyclones (TCs) in order to identify inherent spatial clusters and their temporal characteristics. The dataset consists of 1698 TCs from 1945 to 2015, with the following fields: TC name, longitude, latitude, wind speed, pressure, and date of occurrence. Clustering performance is measured with the silhouette score and inertia, and a k-Nearest Neighbor (kNN) classifier is used for additional cluster validation. These metrics indicate that the optimal number of NWPB TC genesis clusters is five, with a silhouette score, inertia, and average kNN accuracy of 0.42, 75412.60, and 0.989, respectively. Temporal analysis of the clusters shows a shift in TC genesis location, consistent with earlier findings in the literature. This study demonstrates how machine learning (ML) techniques complement traditional methods in analyzing climate variables.
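The evaluation loop described above (k-means for several candidate cluster counts, scored by silhouette and inertia, then checked with kNN) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It makes several assumptions: the points are synthetic rather than the NWPB records, longitude and latitude are treated as planar coordinates with Euclidean distance, the kNN check is done leave-one-out with five neighbors, and all function names are hypothetical. It uses only the Python standard library so that it runs anywhere; in practice a library such as scikit-learn would be the usual choice.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two (lon, lat) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
            for p in points]


def kmeans(points, k, rng, n_iter=100):
    """Lloyd's algorithm. Returns (labels, centroids, inertia)."""
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                updated.append((sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members)))
            else:
                updated.append(centroids[c])  # keep centroid of an empty cluster
        if updated == centroids:  # assignments stopped changing
            break
        centroids = updated
    labels = assign(points, centroids)
    # Inertia: sum of squared distances from each point to its centroid.
    inertia = sum(dist2(p, centroids[l]) for p, l in zip(points, labels))
    return labels, centroids, inertia


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1 or len(clusters) < 2:
            continue  # such points contribute 0 by convention
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for c, other in clusters.items() if c != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def knn_loo_accuracy(points, labels, n_neighbors=5):
    """Leave-one-out kNN accuracy: can neighbours recover each cluster label?"""
    hits = 0
    for i, p in enumerate(points):
        nearest = sorted((j for j in range(len(points)) if j != i),
                         key=lambda j: dist2(p, points[j]))[:n_neighbors]
        votes = [labels[j] for j in nearest]
        hits += max(set(votes), key=votes.count) == labels[i]
    return hits / len(points)


# Synthetic stand-in for genesis points: three blobs at made-up (lon, lat) centres.
rng = random.Random(0)
centres = [(115.0, 15.0), (135.0, 12.0), (155.0, 18.0)]
points = [(rng.gauss(cx, 2.0), rng.gauss(cy, 2.0))
          for cx, cy in centres for _ in range(50)]

results = {}
for k in range(2, 7):
    # Keep the lowest-inertia run out of a few random restarts.
    labels, _, inertia = min((kmeans(points, k, rng) for _ in range(5)),
                             key=lambda run: run[2])
    results[k] = (silhouette(points, labels), inertia,
                  knn_loo_accuracy(points, labels))
    print(k, *(round(v, 3) for v in results[k]))
```

Silhouette and inertia are usually read together: inertia tends to keep falling as k grows, so a common heuristic is to look for the k where the silhouette peaks or the inertia curve flattens. A high kNN accuracy then indicates that cluster membership is locally consistent, meaning that neighboring points mostly share a label.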