Profile


Not everything comes naturally to us, but maybe language does.

Man has an instinctive tendency to speak, as we see in the babble of our young children, while no child has an instinctive tendency to bake, brew or write. —Charles Darwin

Research Interest

My research interests focus on Text Mining, i.e., mining knowledge from text data and using that knowledge to empower Natural Language Processing.

Education

  • UNIVERSITY OF ILLINOIS, URBANA-CHAMPAIGN (UIUC)
    • Ph.D. in Computer Science, Expected 2021
    • Advisor: Prof. Jiawei Han
    • GPA: 4.0/4.0
    • Rank: Top 1%
  • UNIVERSITY OF SCIENCE AND TECHNOLOGY OF CHINA (USTC)
    • B.E. in Computer Science, 2012 - 2016
    • Advisor: Prof. Linli Xu
    • Major GPA: 3.93/4.3
    • Rank: Top 5%

Awards

  • Guo Moruo Scholarship (2015)
    • The highest honor for students at USTC
  • Google Excellence Scholarship (2015)
    • Only 58 students shortlisted nationwide
  • ICDM Travel Award (2015)
  • Samsung Scholarship (2015)
  • Tencent Innovative Scholarship (2014)
  • Seagate Scholarship (2013)
  • Scholarship for Outstanding Students (2013)

Professional Experience

  • DATA MINING GROUP (Research Assistant, Aug 2016 - present, Urbana, IL)
    • Proposed a Gaussian-vMF Mixture Model that clusters tweets based on geo-location and context embeddings; the model is used to detect local events, and the work was accepted by KDD 2017.
    • Designed and implemented a novel framework, based on representation learning, that extracts relations by leveraging different sources of supervision; accepted by EMNLP 2017.
    • Extracted knowledge from raw text with a language model to empower sequence labeling tasks, achieving state-of-the-art results on both Named Entity Recognition and POS Tagging; currently under review.
  • Social Computing Group, MICROSOFT RESEARCH ASIA (Intern, July 2015 - May 2016, Beijing, China)
    • Developed and shipped Bing Dictionary’s word reciting component with teammates. Designed a new interaction strategy that allows the algorithm to capture a user’s specific familiarity level with a word.
    • Explored the underlying reasoning behind applying arithmetic to embedding vectors. Tried employing sparse coding to improve cosine similarity in the embedding space.
  • UCOLLEGE STUDENT RESEARCH PROGRAM (Research Assistant, Oct. 2013 - Dec. 2014)
    • Independently conducted the project "Community Detection based on Information Network and Its Application". Iteratively refined the model and achieved twice the initial precision.
    • Wrote and submitted a paper to the International Conference on Data Mining (ICDM 2015) as first author; it was accepted as a Regular Paper.

Competition Experience

  • WSDM Cup 2017, Vandalism Detection Task:
    • 3rd place