Evolution is the process by which populations of organisms undergo genetic changes over successive generations and adapt to their environment. The genome of an organism contains its complete set of genetic instructions, including the information necessary for its development, functioning, and response to the environment. Understanding which genetic variants are responsible for adaptation is therefore crucial, especially for comprehending the dynamics of fast-evolving pathogens and cancers. For example, rapidly evolving high-risk pathogens such as HIV-1 and influenza can accumulate advantageous mutations that enable them to evade the human immune system. In evolutionary biology, fitness is a measure of an organism's reproductive success in its environment, that is, its ability to contribute genetic material to future generations. However, because genomes contain thousands to billions of base pairs in specific arrangements, estimating the fitness effects of genetic variants is challenging. Epistatic interactions, in which the effect of a genetic variant depends on the presence of other variants in the same sequence, make the task harder still. Although researchers use advanced quantitative methods to decode these interactions, challenges remain because of the growing dimensionality of the fitness landscape and the difficulty of interpreting quantitative measurements.
To quantify the mutational effects of genetic variants, this work presents a method, Marginal Path Likelihood (MPL), that infers fitness parameters from observed evolutionary histories of genetic sequences. Extended to include epistatic interactions, the approach expresses the probability of an evolutionary path as a path integral, a technique derived from statistical physics, and uses Bayes' theorem to estimate the fitness parameters that best explain an observed evolutionary trajectory. These parameters include selection coefficients (relative fitness) and epistasis (the part of fitness that differs from the sum of the fitness effects of the individual mutations). In tests on evolutionary simulations and mutagenesis experiments, this approach proves to be more consistent and explanatory than current state-of-the-art methods, even under finite sampling. Mutagenesis experiments generate large numbers of genetic variants, making it possible to explore the functional consequences of many variants simultaneously. This work also reports a pipeline package, popDMS, that processes this kind of genetic time-series data automatically.