Abstract:
|
In text mining and gene expression analysis, collections of texts are represented in a vector-space model, which implies that texts, once standardized, are coded as vectors on a high-dimensional sphere, also called a hypersphere [9]. Many researchers currently model the distributions of these vectors with mixtures of existing probability densities. However, these approximations spread probability mass over the whole hypersphere, when it is actually needed only in the positive orthant of the hypersphere. This is mainly because no suitable distributions exist for that subspace. The proposed distribution fills that void, allowing more efficient modeling of these vectors.
|