Human activity recognition (HAR) has many important applications in health care. While machine learning-based techniques have been applied to wearable sensor-based HAR, very few researchers have comprehensively studied the effects of various factors on the accuracy and robustness of activity classification. This paper presents a detailed empirical study of machine learning-based HAR schemes. The objective is to improve human activity recognition using techniques that do not increase computational overhead. We describe and evaluate techniques for feature extraction, feature selection, and classification. We perform a series of experiments on a dataset consisting of readings from hip-worn sensors of 77 subjects of varying ages performing several ambulatory and non-ambulatory activities. Through these experiments, we show that frequency-domain analysis of accelerometer readings reveals useful information about human activity patterns, and that combining frequency-domain features with time-domain features significantly improves accuracy. Our experiments find the random forest algorithm to be the most accurate for HAR. We also show that the size of the accelerometer dataset can be reduced to 20% without a drastic reduction in classification accuracy. Furthermore, we show that age-based grouping has a significant impact on classification, and that age-specific training of classifiers can yield a significant performance improvement.
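As an illustration of what combining time-domain and frequency-domain features can look like, the following is a minimal sketch and not the feature set evaluated in this paper. It assumes a single-axis accelerometer window; the function names, the chosen statistics, the 32 Hz sampling rate, and the 64-sample window are all hypothetical choices made for the example.

```python
import cmath
import math
import statistics


def time_domain_features(window):
    """Mean, standard deviation, and range of one window (illustrative choice)."""
    return [
        statistics.fmean(window),
        statistics.pstdev(window),
        max(window) - min(window),
    ]


def frequency_domain_features(window, sampling_rate_hz):
    """Dominant frequency in Hz and spectral energy, from a naive DFT."""
    n = len(window)
    mean = statistics.fmean(window)
    centered = [x - mean for x in window]  # remove the constant (DC) offset
    magnitudes = []
    for k in range(1, n // 2 + 1):  # positive-frequency bins only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        magnitudes.append(abs(coeff))
    # Bin index of the largest magnitude (+1 because bin 0 was skipped).
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__) + 1
    dominant_hz = peak_bin * sampling_rate_hz / n
    energy = sum(m * m for m in magnitudes) / n
    return [dominant_hz, energy]


def extract_features(window, sampling_rate_hz):
    """Concatenate time-domain and frequency-domain features into one vector."""
    return time_domain_features(window) + frequency_domain_features(
        window, sampling_rate_hz
    )


# Synthetic example: a 2 Hz oscillation sampled at an assumed 32 Hz for 2 seconds.
SAMPLING_RATE_HZ = 32
window = [math.sin(2 * math.pi * 2 * t / SAMPLING_RATE_HZ) for t in range(64)]
features = extract_features(window, SAMPLING_RATE_HZ)
```

A feature vector of this kind, computed per window, could then be passed to a classifier such as a random forest. The naive DFT is used here only to keep the sketch dependency-free; a fast Fourier transform from a numerical library would be the usual choice for longer windows.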