hdp/hdp.h Documentation

hdp.h

This file contains data structures that implement Dirichlet processes and hierarchical Dirichlet processes. This file does not implement inference algorithms for these structures. See hdp/mcmc.h for examples that use DPs and HDPs and perform inference.

Dirichlet processes

A Dirichlet process (DP) can be understood as a distribution over distributions. That is, samples from a Dirichlet process are themselves distributions. A Dirichlet process is characterized by two parameters: a real-valued concentration parameter $ \alpha > 0 $, and a base distribution $ H $. So if we let $ G $ be a random variable distributed according to a Dirichlet process with parameters $ \alpha $ and $ H $, we can express this as:

\[ G \sim \text{DP}(H, \alpha). \]

There are a handful of equivalent representations of Dirichlet processes. One such representation is the Chinese restaurant process. Imagine a restaurant with an infinite number of tables, numbered $ 0, 1, \ldots $. On each table is an independent sample from the base distribution $ H $. When the first customer walks into the restaurant, they sit at table $ 0 $. When the second customer enters, they sit either at table $ 0 $ with probability $ 1/(1 + \alpha) $ or at table $ 1 $ with probability $ \alpha/(1 + \alpha) $. More generally, when the $ n $-th customer enters the restaurant, they choose to sit at a non-empty table $ i $ with probability proportional to the number of people sitting at that table, or they sit at the next empty table with probability proportional to $ \alpha $.
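
The seating rule can also be written as a single formula. This is a standard restatement of the rule above; the symbols $ z_n $ (the table chosen by the $ n $-th customer), $ c_i $ (the number of customers already seated at table $ i $), and $ k $ (the number of non-empty tables) are notation introduced here for illustration:

\[ P(z_n = i \mid z_1, \ldots, z_{n-1}) = \begin{cases} \dfrac{c_i}{n - 1 + \alpha} & \text{if } 0 \le i < k, \\ \dfrac{\alpha}{n - 1 + \alpha} & \text{if } i = k. \end{cases} \]

The denominator follows from $ c_0 + \ldots + c_{k-1} = n - 1 $. For $ n = 2 $ we have $ k = 1 $ and $ c_0 = 1 $, which recovers the probabilities $ 1/(1 + \alpha) $ and $ \alpha/(1 + \alpha) $ given above.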

This process is equivalent to the process of drawing samples from $ G $:

\[ \begin{align*} G &\sim \text{DP}(H, \alpha), \\ X_1, X_2, \ldots, X_n &\sim G, \end{align*} \]

where $ X_1, \ldots, X_n $ are drawn independently and identically from $ G $. In the restaurant metaphor, $ X_1 $ is the sample from $ H $ that the first customer discovered at their table, $ X_2 $ is the sample from $ H $ that the second customer discovered, and so on. Thus, this representation describes how to draw samples from a Dirichlet process, when $ G $ is collapsed/integrated out.
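
The seating process described above can be simulated directly, without ever representing $ G $. The following is a minimal, self-contained C++ sketch. The function `sample_crp` is a hypothetical helper written for illustration; it is not part of hdp.h and does not use any of its data structures.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Simulates the Chinese restaurant process for n customers with
// concentration parameter alpha. Returns the table index chosen by each
// customer. (Illustrative helper; not part of hdp.h.)
std::vector<unsigned int> sample_crp(unsigned int n, double alpha, std::mt19937& rng) {
    assert(alpha > 0.0);
    std::vector<unsigned int> assignments;
    // weights[i] is the number of customers at table i; the final entry is
    // alpha, the weight of the next empty table.
    std::vector<double> weights{alpha};
    for (unsigned int customer = 0; customer < n; customer++) {
        std::discrete_distribution<unsigned int> choose(weights.begin(), weights.end());
        unsigned int table = choose(rng);
        if (table + 1 == weights.size()) {
            // The customer opened a new table: it now holds one customer,
            // and a fresh empty table becomes available.
            weights.back() = 1.0;
            weights.push_back(alpha);
        } else {
            weights[table] += 1.0;
        }
        assignments.push_back(table);
    }
    return assignments;
}
```

Calling `sample_crp(10, 0.5, rng)` with a seeded `std::mt19937` returns ten table indices. Pairing each table index with an independent draw from $ H $ then yields $ X_1, \ldots, X_n $.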

It is impossible to write a closed-form expression for the distribution $ G $, since its specification requires an infinite amount of information. But for the useful applications of the Dirichlet process, this is not necessary.

Hierarchical Dirichlet processes

A hierarchical Dirichlet process (HDP) is a hierarchy of random variables, where each random variable is distributed according to a Dirichlet process with base distribution given by the parent in the hierarchy. To be more precise, given a tree $ T $, every node $ \textbf{n} \in T $ is associated with a random variable $ G_{\textbf{n}} $ such that

\[ G_{\textbf{n}} \sim \text{DP}(G_{p(\textbf{n})}, \alpha_{\textbf{n}}), \]

where $ p(\textbf{n}) $ returns the parent node of $ \textbf{n} $ in $ T $. The random variable at the root node $ \textbf{0} $ is drawn from a Dirichlet process with a fixed root base distribution $ H $:

\[ G_{\textbf{0}} \sim \text{DP}(H, \alpha_{\textbf{0}}). \]

Note that the concentration parameter may also differ across the nodes in the tree.

The HDP allows statistical information to be shared across groups, and is useful for modeling clustered data.
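
To make the tree structure concrete, the following is a small self-contained C++ sketch of a tree whose nodes each carry their own concentration parameter and whose nodes are addressed by a path of child keys. It loosely mirrors the `children`, `alpha`, and `observations` members and the path-based `add` and `contains` functions listed in this file, but it is not the implementation in hdp.h. The names `toy_node`, `toy_add`, and `toy_contains`, the use of `int` observations, and the convention of one concentration parameter per level are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Toy stand-in for a node of the tree T (not the node<K, V> struct in hdp.h).
struct toy_node {
    double alpha;                                               // concentration parameter of this node
    std::map<unsigned int, std::unique_ptr<toy_node>> children; // child nodes, keyed by unsigned int
    std::vector<int> observations;                              // observations attached to this node

    explicit toy_node(double alpha) : alpha(alpha) { assert(alpha > 0.0); }
};

// Follows `path` from the root, creating missing nodes along the way, and
// stores the observation at the node that is reached. alpha[0] is assumed to
// belong to the root and alpha[k] to every node at level k.
void toy_add(toy_node& root, const double* alpha, const unsigned int* path,
             unsigned int length, int observation) {
    toy_node* current = &root;
    for (unsigned int level = 0; level < length; level++) {
        std::unique_ptr<toy_node>& child = current->children[path[level]];
        if (!child) child = std::make_unique<toy_node>(alpha[level + 1]);
        current = child.get();
    }
    current->observations.push_back(observation);
}

// Returns true if the node at `path` exists and holds the observation.
bool toy_contains(const toy_node& root, const unsigned int* path,
                  unsigned int length, int observation) {
    const toy_node* current = &root;
    for (unsigned int level = 0; level < length; level++) {
        auto it = current->children.find(path[level]);
        if (it == current->children.end()) return false;
        current = it->second.get();
    }
    return std::find(current->observations.begin(), current->observations.end(),
                     observation) != current->observations.end();
}
```

In this sketch, sibling nodes such as those at paths `{3, 7}` and `{3, 8}` share the parent at path `{3}`, which is the structural basis for sharing statistical information across groups.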

DP/HDP mixture models

DPs and HDPs are frequently used in mixture models, where the samples from the DP/HDP are not themselves directly observed. Rather, they are inputs to another distribution, which, in turn, provides the observed samples. For example, the following is a simple DP mixture model, where the base distribution is Beta and the likelihood is Bernoulli:

\[ \begin{align*} H &= \text{Beta}(2, 4), \\ G &\sim \text{DP}(H, 0.1), \\ X_1, \ldots, X_n &\sim G, \\ Y_i &\sim \text{Bernoulli}(X_i) \text{ where } i = 1, \ldots, n. \end{align*} \]

In the mixture model, we observe $ Y_i $ but not $ X_i $. If the user wishes to use the DP/HDP samples directly, they can do so by using the constant (degenerate) distribution as the likelihood.
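
The generative process of the example mixture model can be sketched in self-contained C++ by combining the Chinese restaurant process with a Beta base distribution and a Bernoulli likelihood. This sketch only generates data from the model and performs no inference. The names `sample_beta`, `mixture_sample`, and `sample_dp_mixture` are hypothetical and are not part of hdp.h.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Draws one sample from Beta(a, b) as a ratio of two gamma draws.
double sample_beta(double a, double b, std::mt19937& rng) {
    std::gamma_distribution<double> gamma_a(a, 1.0), gamma_b(b, 1.0);
    double x = gamma_a(rng), y = gamma_b(rng);
    return x / (x + y);
}

struct mixture_sample {
    std::vector<double> x; // latent X_i (not observed in the mixture model)
    std::vector<bool> y;   // observed Y_i
};

// Generates n samples from the model H = Beta(2, 4), G ~ DP(H, alpha),
// X_i ~ G, Y_i ~ Bernoulli(X_i), with G integrated out via the Chinese
// restaurant process.
mixture_sample sample_dp_mixture(unsigned int n, double alpha, std::mt19937& rng) {
    assert(alpha > 0.0);
    mixture_sample out;
    std::vector<double> atoms;          // atoms[i] is the draw from H at table i
    std::vector<double> weights{alpha}; // customer counts per table, then alpha
    for (unsigned int i = 0; i < n; i++) {
        std::discrete_distribution<unsigned int> choose(weights.begin(), weights.end());
        unsigned int table = choose(rng);
        if (table == atoms.size()) {
            // New table: draw its atom from the base distribution H.
            atoms.push_back(sample_beta(2.0, 4.0, rng));
            weights.back() = 1.0;
            weights.push_back(alpha);
        } else {
            weights[table] += 1.0;
        }
        double x_i = atoms[table];
        std::bernoulli_distribution likelihood(x_i);
        out.x.push_back(x_i);
        out.y.push_back(likelihood(rng));
    }
    return out;
}
```

With a small concentration parameter such as $ 0.1 $, most customers join an existing table, so the latent values $ X_i $ typically take only a few distinct values even for large $ n $.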

Classes, functions, and variables in this file
#define	IMPLICIT_NODE
struct	node
bool	init (node< K, V > & n, const V & alpha)
bool	copy (const node< K, V > & src, node< K, V > & dst, hash_map< const node< K, V > , node< K, V > > & node_map)
struct	node_scribe
bool	read (node< K, V > & node, FILE * in, node_scribe< AtomReader, KeyReader > & node_reader)
bool	write (const node< K, V > & node, FILE * out, node_scribe< AtomWriter, KeyWriter > & node_writer)
struct	hdp
bool	init (hdp< BaseDistribution, DataDistribution, K, V > & h, BaseParameters & base_params, const V * alpha, unsigned int depth)
bool	copy (const hdp< BaseDistribution, DataDistribution, K, V > & src, hdp< BaseDistribution, DataDistribution, K, V > & dst, hash_map< const node< K, V > , node< K, V > > & node_map)
bool	read (hdp< BaseDistribution, DataDistribution, K, V > & h, FILE * in, BaseDistributionScribe & base_reader, AtomScribe & atom_reader, KeyScribe & key_reader)
bool	write (const hdp< BaseDistribution, DataDistribution, K, V > & h, FILE * out, BaseDistributionScribe & base_writer, AtomScribe & atom_writer, KeyScribe & key_writer)
bool	add (hdp< BaseDistribution, DataDistribution, K, V > & h, const unsigned int * path, unsigned int depth, const K & observation)
bool	contains (NodeType & n, const unsigned int * path, unsigned int length, const typename NodeType::atom_type & observation)

K	the generic type of the observations drawn from this distribution.
V	the type of the probabilities.

Public members
array_map< unsigned int, node< K, V > >	children
V	alpha
V	log_alpha
array< K >	observations
	node (const V & alpha)
V	get_alpha () const
V	get_log_alpha () const
static void	move (const node< K, V > & src, node< K, V > & dst)
static void	swap (node< K, V > & first, node< K, V > & second)
static void	free (node< K, V > & n)
typedef	K atom_type
typedef	V value_type

bool	copy (const node< K, V > & src, node< K, V > & dst, hash_map< const node< K, V > , node< K, V > > & node_map)

Public members
AtomScribe &	atom_scribe
KeyScribe &	key_scribe
	node_scribe (AtomScribe & atom_scribe, KeyScribe & key_scribe)

bool	read (node< K, V > & node, FILE * in, node_scribe< AtomReader, KeyReader > & node_reader)

AtomScribe	a scribe type for which the function `bool read(K&, FILE*, AtomScribe&)` is defined.
KeyScribe	a scribe type for which the function `bool read(unsigned int&, FILE*, KeyScribe&)` is defined.

BaseDistribution	the type of the base distribution.
DataDistribution	the type of the likelihood (as in a DP/HDP mixture model).
K	the generic type of the observations drawn from this distribution.
V	the type of the probabilities.

Public members
BaseDistribution	pi
unsigned int	depth
V *	alpha
V	log_alpha
array_map< unsigned int, node< K, V > >	children
array< K >	observations
	hdp (const BaseParameters & base_params, const V * alpha, unsigned int depth)
V	get_alpha () const
V	get_log_alpha () const
static void	free (hdp< BaseDistribution, DataDistribution, K, V > & h)
typedef	K atom_type
typedef	V value_type
typedef	BaseDistribution base_distribution_type
typedef	DataDistribution data_distribution_type

	hdp (const BaseParameters & base_params, const V * alpha, unsigned int depth)

bool	init (hdp< BaseDistribution, DataDistribution, K, V > & h, BaseParameters & base_params, const V * alpha, unsigned int depth)

bool	copy (const hdp< BaseDistribution, DataDistribution, K, V > & src, hdp< BaseDistribution, DataDistribution, K, V > & dst, hash_map< const node< K, V > , node< K, V > > & node_map)

bool	read (hdp< BaseDistribution, DataDistribution, K, V > & h, FILE * in, BaseDistributionScribe & base_reader, AtomScribe & atom_reader, KeyScribe & key_reader)

BaseDistributionScribe	a scribe type for which the function `bool read(BaseDistribution&, FILE*, BaseDistributionScribe&)` is defined.
AtomScribe	a scribe type for which the function `bool read(K&, FILE*, AtomScribe&)` is defined.
KeyScribe	a scribe type for which the function `bool read(unsigned int&, FILE*, KeyScribe&)` is defined.

bool	write (const hdp< BaseDistribution, DataDistribution, K, V > & h, FILE * out, BaseDistributionScribe & base_writer, AtomScribe & atom_writer, KeyScribe & key_writer)

bool	add (hdp< BaseDistribution, DataDistribution, K, V > & h, const unsigned int * path, unsigned int depth, const K & observation)

bool	contains (NodeType & n, const unsigned int * path, unsigned int length, const typename NodeType::atom_type & observation)
