Statistics for Social Data
Course staff
- Instructor
- Prof. Perry (pperry@stern.nyu.edu)
- Office Hours: Thursday, 2:00PM-3:50PM, KMC 8-63
Lecture slides
Handouts and assignments
Readings
- Perry and Wolfe (2013), Point process modelling
for directed interaction networks.
- Aral and Walker (2012), Identifying influential and
susceptible members of social networks.
- Newman (2006), Finding community structure in
networks using the eigenvalues of matrices.
- Bickel and Chen (2009), A nonparametric view of
network models and Newman-Girvan and other modularities.
- Handcock et al. (2007), Model-based clustering for social networks.
- Robins et al. (2007), An introduction to
exponential random graph (p-star) models for social networks.
- Robins et al. (2007), Recent developments in
exponential random graph (p-star) models for social networks.
- Liu (2011),
Opinion Mining and Sentiment Analysis. Chapter 11 from
Web Data Mining.
- Chen, Lin, and Zhou (2015).
Statistical decision making for optimal budget allocation in crowd labeling.
- Řehůřek (2014). Word2vec tutorial.
- Mikolov, Sutskever, Chen, Corrado, and Dean (2013). Distributed representations of words and phrases and their compositionality.
- Levy and Goldberg (2014). Neural word embedding as implicit matrix factorization.
- Blei, Ng, and Jordan (2003). Latent Dirichlet
allocation.
- Lee and Seung (1999). Learning the parts of
objects by non-negative matrix factorization.
- Kolda and O'Leary (1998). A semidiscrete matrix
decomposition for latent semantic indexing in information retrieval.
- Manning, Raghavan, and Schütze (2008).
Matrix Decompositions and Latent Semantic Indexing. Chapter 18
from Introduction to Information Retrieval.
- Honnibal (2013).
A Good Part-of-Speech Tagger in about 200 Lines of Python.
- Gimpel, Schneider, O'Connor, Das, Mills, Eisenstein, Heilman, Yogatama,
Flanigan, and Smith (2011). Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments.
- Jurafsky and Martin (2015). Speech and Language Processing (3rd ed. draft):
- Sliusarenko and Dyomkin (2014).
How to split sentences.
- Read, Dridan, Oepen, and Solberg (2012).
Sentence boundary detection: A long solved problem?
- Kiss and Strunk (2006).
Unsupervised multilingual sentence boundary detection.
- Mosteller and Wallace (1963).
Inference in an authorship problem.
Datasets