CARBONE Anna

Kullback-Leibler cluster entropy: An inferential tool for long-range correlated data

A short overview is offered of an information-theoretical measure, the relative cluster entropy D_C[P||Q], which discriminates among cluster partitions characterised by probability distribution functions P and Q. The measure is illustrated with the clusters generated by pairs of fractional Brownian motions with Hurst exponents H_1 and H_2 respectively. For subdiffusive, normal and superdiffusive sequences, the relative entropy depends sensitively on the difference between H_1 and H_2. By using the minimum relative entropy principle, cluster sequences characterized by different degrees of correlation are distinguished and the optimal Hurst exponent is selected. We present results for financial and genomic data sequences.
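The selection step described above can be illustrated with a minimal sketch. It is not the procedure used in the work summarized here: the power-law model for the cluster-duration distribution (exponent 2 - H), the cutoff `max_duration`, the candidate grid and all function names are assumptions made for illustration only. The sketch computes the Kullback-Leibler divergence between an "empirical" distribution P and a family of model distributions Q(H), and picks the H that minimizes it.

```python
import math


def power_law_pdf(hurst, max_duration=200):
    """Toy model (assumption): P(tau) proportional to tau**-(2 - hurst),
    for integer cluster durations tau = 1..max_duration, normalized to 1."""
    weights = [tau ** -(2.0 - hurst) for tau in range(1, max_duration + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def relative_entropy(p, q):
    """Kullback-Leibler divergence D[P||Q] = sum_i p_i * log(p_i / q_i)."""
    if len(p) != len(q):
        raise ValueError("P and Q must be defined on the same support")
    divergence = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            if q_i <= 0.0:
                return math.inf  # P puts mass where Q has none
            divergence += p_i * math.log(p_i / q_i)
    return divergence


def select_hurst(p_empirical, candidates):
    """Return the candidate Hurst exponent whose model distribution
    has the smallest relative entropy with respect to p_empirical."""
    size = len(p_empirical)
    return min(
        candidates,
        key=lambda h: relative_entropy(p_empirical, power_law_pdf(h, size)),
    )


if __name__ == "__main__":
    # Stand-in for an empirical distribution: the toy model itself at H = 0.6.
    p_empirical = power_law_pdf(0.6)
    grid = [0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
    print("selected Hurst exponent:", select_hurst(p_empirical, grid))
```

In practice P would be estimated from the clusters of the data sequence rather than generated from the model, and the candidate grid would be finer.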

As a first case study, real-world cluster partitions of market price series are compared to those obtained from fully uncorrelated sequences (simple Brownian motions) assumed as a model. The minimum relative cluster entropy yields optimal Hurst exponents H_1 = 0.55, H_1 = 0.57 and H_1 = 0.63 for the prices of DJIA, S&P500 and NASDAQ respectively: a clear indication of non-Markovianity. The relative cluster entropy D_C[P||Q] is evaluated for the empirical and model probability distributions P and Q of the clusters formed in the realized volatility time series of five assets (S&P500, NASDAQ, DJIA, DAX, FTSEMIB). The Kullback-Leibler functional D_C[P||Q] provides a perspective on the stochastic volatility process that is complementary to that of the Shannon functional S_C[P]. While D_C[P||Q] is maximum at short time scales, S_C[P] is maximum at large time scales, leading to complementary optimization criteria that trace back to the maximum and minimum relative entropy evolution principles respectively. The realized volatility is modelled as a time-dependent fractional stochastic process characterized by power-law decaying distributions with positive correlation. A multiperiod portfolio is built on diversity indexes derived from the Kullback-Leibler entropy measure of the realized volatility. The portfolio is robust and exhibits better performance over the horizon periods. A comparison with portfolios built either according to the uniform distribution or in the framework of Markowitz theory is also reported.
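One elementary way to see how a Kullback-Leibler functional and a Shannon functional can carry complementary information is the textbook identity D[P||U] = log N - S[P], valid when the reference distribution U is uniform over N outcomes. The sketch below checks it numerically. The uniform reference and the example distribution are assumptions chosen for simplicity; the work summarized here uses a model distribution Q for the clusters, not a uniform one.

```python
import math


def shannon_entropy(p):
    """Shannon entropy S[P] = -sum_i p_i * log(p_i), in nats."""
    return -sum(p_i * math.log(p_i) for p_i in p if p_i > 0.0)


def relative_entropy(p, q):
    """Kullback-Leibler divergence D[P||Q] = sum_i p_i * log(p_i / q_i).
    Assumes q_i > 0 wherever p_i > 0."""
    return sum(p_i * math.log(p_i / q_i) for p_i, q_i in zip(p, q) if p_i > 0.0)


if __name__ == "__main__":
    # Hypothetical distribution over four outcomes, for illustration only.
    p = [0.5, 0.25, 0.125, 0.125]
    uniform = [1.0 / len(p)] * len(p)

    d_kl = relative_entropy(p, uniform)
    s = shannon_entropy(p)
    # With a uniform reference, the two functionals sum to log N,
    # so one is largest exactly where the other is smallest.
    print("D[P||U] + S[P] =", d_kl + s, " log N =", math.log(len(p)))
```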

We study the recently assembled T2T-CHM13 human reference genome, which includes all centromeric regions and the entire short arms of five chromosomes, thus filling all the gaps still present in the GRCh38 assembly. We systematically conduct the statistical analysis of all 24 chromosomes of the human reference genome by using the recently proposed relative cluster entropy, a measure of divergence between probability distribution functions. We observe a marked dependence of the relative cluster entropy in correspondence with the newly filled gaps in all the T2T-CHM13 chromosomes. We discuss potential biological implications and set future research directions.

Further reading at www.polito.it/noiselab