Satellite images are important information sources of the earth environment. Automatic classification of satellite images has always been an important research topic. With the recent advancement of deep learning, the convolutional neural network (CNN) approach has shown great potential in object detection in high resolution images. However, insufficient labelled samples and constrained input image sizes have limited the wide application of CNN for remote sensing. In this study, the Hierarchical Active Learning (HAL) framework is proposed by incorporating transfer learning, tile map service (TMS), and active learning to enable effective scene classification with very few manually labelled samples. A case study of vehicle detection with HAL has been conducted and shows that HAL can achieve the accuracy of more than 80% with only 50 training samples for a large area. Moreover, its ability to extend to incorporation of different TMS image sources and CNN models makes it useful for various object detection tasks.