Wide area data transfers play an important role in many science applications but rely on expensive infrastructure that often delivers disappointing performance in practice. In response, we present a systematic examination of a large set of data transfer logs: 20.5 billion GridFTP STOR command logs totaling 1.5 exabytes received and 19.4 billion GridFTP RETR command logs totaling 1.8 exabytes transmitted, collected over the past four years from 63,166 GridFTP servers distributed around the world. We use these logs to characterize the transfers, including the nature of the datasets transferred, achieved throughput, user behavior, and resource usage. This analysis yields new insights that can help design better data transfer tools, optimize the networking and edge resources used for transfers, and improve the performance and experience of end users. Specifically, our analysis shows that (i) most datasets, as well as most individual files transferred, are very small; (ii) data corruption is not negligible for large data transfers; and (iii) utilization of data transfer nodes is low. These findings also suggest directions for further study. The resulting insights can help (i) resource providers optimize the resources used for data transfer; (ii) researchers and tool developers build new data transfer protocols and tools, or optimize existing ones; (iii) end users organize their datasets to maximize performance; and (iv) funding agencies plan investments.