Research Article | Peer-Reviewed

Integrated Machine Learning-based Toxicity Prediction with Molecular Docking for Safer Drug Candidate Screening in Parkinson’s Disease

Received: 8 May 2026     Accepted: 18 May 2026     Published: 27 June 2026
Abstract

Parkinson’s disease is a progressive neurodegenerative disorder marked by the abnormal buildup of α-synuclein into toxic fibrils, which lead to neuronal degeneration and motor problems. Among the identified structural variants, the type 5A polymorphic structure (8PK4) has been strongly associated with disease progression and represents a promising therapeutic target for the development of safer and more effective drug candidates. In the present study, an integrated computational framework combining molecular docking and machine-learning-based toxicity prediction was employed to identify natural compounds with high predicted therapeutic efficacy and minimal toxic effects. Five bioactive phytochemicals, namely baicalein, rutin, ellagic acid, kaempferol, and ferulic acid, were selected based on their reported neuroprotective potential and screened against the α-synuclein target protein. Molecular docking analysis was performed using the CB-Dock platform to evaluate binding affinity, interaction stability, and residue-level interactions within the predicted binding pockets. All selected compounds exhibited favourable binding interactions with critical amino acid residues, particularly PHE4, LYS6, and GLU35, which are associated with α-synuclein aggregation and stabilisation. Among the tested compounds, ellagic acid displayed the strongest binding affinity and the most stable interaction profile, suggesting enhanced inhibitory potential against the target protein. To further assess drug safety, toxicity predictions were performed using the ProTox-II machine-learning platform, evaluating multiple toxicity endpoints, including hepatotoxicity, neurotoxicity, mutagenicity, carcinogenicity, immunotoxicity, and cytochrome P450-mediated interactions.
The toxicity assessment indicated that ellagic acid had low predicted toxicity across most endpoints, while rutin showed the highest predicted LD50 value, indicating reduced acute toxicity and a favourable safety margin. The integration of molecular docking with artificial intelligence-driven toxicity prediction provides a rapid and cost-effective strategy for safer drug candidate screening in Parkinson’s disease research. Overall, the study highlights the potential of natural compounds, particularly ellagic acid, as promising therapeutic leads for further experimental validation and future neuroprotective drug development.

Published in Computational Biology and Bioinformatics (Volume 14, Issue 1)
DOI 10.11648/j.cbb.20261401.14
Page(s) 41-53
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2026. Published by Science Publishing Group

Keywords

Parkinson’s Disease, α-Synuclein, 8PK4 Polymorph, Molecular Docking, Machine Learning, Toxicity Prediction, ProTox-II, Drug Discovery

1. Introduction
Parkinson's disease (PD) is the second most common neurodegenerative disease. It is characterised by the progressive death of dopamine-producing neurons in the substantia nigra and by the accumulation of misfolded protein aggregates in the brain. A key feature of the disease is the formation of amyloid fibrils composed of alpha-synuclein, which disrupt normal neuronal function, interfere with synaptic activity, and cause toxicity within the nervous system.
Recent advances in the structural understanding of alpha-synuclein have shown that it exists in many forms (polymorphs) with differing biochemical and pathological characteristics. One of these, the type 5A fibrillar polymorph represented by 8PK4, has attracted particular attention because of its stability under stress, its propensity to form aggregates, and its association with PD progression.
While substantial progress has been made toward understanding the pathology of PD, there are still very few therapeutic strategies that directly target the aggregation of alpha-synuclein. Traditional drug development is lengthy and expensive, and it has a high failure rate because adverse drug reactions often emerge at later stages of development and because many compounds have inadequate pharmacokinetic properties. There is therefore an urgent need for integrative computational methods that evaluate both the binding affinity and the safety of compounds in the early phases of drug discovery.
Advances in artificial intelligence (AI) and machine learning (ML) have changed the way drugs are discovered by enabling rapid screening and prediction of molecular interactions and toxicity endpoints. ML models capture complex structure–activity relationships by identifying patterns that are difficult to establish using traditional approaches. Combining molecular docking with ML techniques provides a powerful means of prioritising compounds by biological relevance and binding affinity. This approach has been used in the Biopython-based drug discovery pipelines for cancer models developed by Uma Kumari and others. Toxicity prediction tools such as ProTox-II can estimate whether a compound is likely to produce adverse effects, including neurotoxicity, mutagenicity, and hepatotoxicity, and whether it is likely to cross the blood–brain barrier (BBB). Identifying these liabilities early reduces the chance of late-stage failure and increases the efficiency of the drug discovery process.
Our previous studies used multi-omic integration and Biopython approaches to highlight key residues and structural motifs that contribute to the fibril stability and aggregation of 8PK4 polymorphs, providing structure- and function-based insight into their role as a potential therapeutic target for Parkinson’s disease. The present study introduces an integrated computational pipeline that combines machine learning-based toxicity prediction with molecular docking to discover safe and effective drug candidates targeting α-synuclein, extending earlier Biopython, NGS, and structure-based drug discovery paradigms.
Bioactive compounds were selected and evaluated via molecular docking for their binding affinities to the 8PK4 structure, followed by toxicity characterisation using ML-based predictive tools. Combining efficacy and safety assessments yields a rational prioritisation of drug candidates with maximal therapeutic potential. This extends previous Biopython and docking workflows for MBP-MCL1 in myeloid cell leukaemia, c-MET inhibitors in glioma, mesothelin, AIF, NUDT5, FGFR2, and other clinically relevant targets documented by Uma Kumari et al.
In summary, our study emphasises the importance of combining structural bioinformatics, ML, and toxicity prediction methods to provide a faster pathway for identifying safe and effective therapeutic agents for Parkinson's disease, particularly those that address the pathophysiology associated with the 8PK4 polymorph of α-synuclein. The strategy builds directly on the integrated multi-omics and Biopython framework previously applied by our group for target identification of α-synuclein polymorphs in PD.
2. Materials and Methods
2.1. Ligand Selection
A set of biologically relevant compounds, such as endogenous metabolites, oxidative stress markers, and small bioactive molecules, was selected for docking, and their structures were downloaded from PubChem.
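The retrieval step can be scripted. The following is a minimal sketch, assuming the ligands are fetched by name through PubChem's PUG REST interface; the article does not state how the downloads were performed, and the helper names `sdf_url` and `download_sdf` are hypothetical. The ligand list is the set of five phytochemicals named in the abstract.

```python
from urllib.parse import quote

PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"

# The five phytochemicals named in the abstract.
LIGANDS = ["baicalein", "rutin", "ellagic acid", "kaempferol", "ferulic acid"]


def sdf_url(name: str, three_d: bool = True) -> str:
    """Build a PUG REST URL that returns an SDF record for a compound name."""
    url = f"{PUG_REST}/compound/name/{quote(name)}/SDF"
    # record_type=3d asks for a computed 3D conformer where PubChem has one.
    return url + "?record_type=3d" if three_d else url


def download_sdf(name: str, out_path: str) -> None:
    """Fetch the SDF for `name` and write it to `out_path` (needs network access)."""
    from urllib.request import urlopen

    with urlopen(sdf_url(name), timeout=30) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())


# Building the URLs does not touch the network; downloading does.
urls = {name: sdf_url(name) for name in LIGANDS}
```

Calling `download_sdf("ellagic acid", "ellagic_acid.sdf")` would then save one ligand file ready for upload to the docking server.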
2.2. Molecular Docking with CB-Dock
Molecular docking was performed using the CB-Dock blind docking server, which combines cavity detection with AutoDock Vina. The docking workflow includes:
1) Upload of processed protein structure (8PK4);
2) Automatic identification of potential binding cavities;
3) Docking of ligands into the predicted binding pockets;
4) Ranking of binding poses based on Vina score.
Output parameters include:
1) Binding affinity (kcal/mol);
2) Binding cavity size and location;
3) Docked ligand conformations.
2.3. Creation of a Compound-protein Interaction Network
Biopython code was used to retrieve publications reporting compounds related to alpha-synuclein, and the results were used to create a compound-protein interaction network that visualises the interaction landscape of the 8PK4 structure. The centre node represents the target protein, the outer nodes represent compounds that interact with it, and the edges indicate the interactions. This network allowed highly interacting compounds to be identified and provided insight into potential modulators of α-synuclein aggregation.
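The hub-and-spoke layout described above can be represented with a plain adjacency map. This is an illustrative sketch only: the article does not give its network-construction code, the function name `build_star_network` is hypothetical, and the compound names are a subset of those reported for the network in Section 3.3.

```python
from collections import defaultdict

TARGET = "8PK4"

# Compound classes and members taken from the network description in Section 3.3.
COMPOUNDS = {
    "neurotransmitter": ["dopamine", "serotonin", "norepinephrine"],
    "metabolic intermediate": ["inosine", "phosphate", "riboflavin"],
    "oxidative stress": ["superoxide", "glutathione", "malondialdehyde"],
    "environmental neurotoxicant": ["paraquat", "MPTP"],
    "natural bioactive": ["resveratrol", "quercetin"],
}


def build_star_network(target, compounds_by_class):
    """Return (adjacency, node_class) for a hub-and-spoke network."""
    adjacency = defaultdict(set)
    node_class = {target: "protein"}
    for cls, names in compounds_by_class.items():
        for name in names:
            # Undirected edge between the central protein and the compound.
            adjacency[target].add(name)
            adjacency[name].add(target)
            node_class[name] = cls
    return adjacency, node_class


adjacency, node_class = build_star_network(TARGET, COMPOUNDS)
degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
hub = max(degree, key=degree.get)  # the node with the most edges
```

In a star network every compound has degree 1, so connectivity alone cannot rank compounds; additional edge attributes (for example, the number of supporting publications) would be needed for that.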
2.4. Sequence Retrieval and Similarity Analysis
The protein sequence corresponding to the 8PK4 structure was extracted using Biopython’s PPBuilder module, and sequence similarity analysis was performed with BLASTP against the NCBI non-redundant database, also via Biopython. The top homologous sequences were then retrieved for downstream machine learning analysis.
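A minimal sketch of this step is given below, assuming the structure is available locally as an mmCIF file and that the installed Biopython release provides `Bio.PDB.PPBuilder` and `Bio.Blast.NCBIWWW`. The function names and the FASTA naming scheme are hypothetical, not taken from the article. Biopython is imported lazily so that the pure-Python helpers can be used without it.

```python
def chain_sequences(structure_path, structure_id="8PK4"):
    """Return {chain_id: sequence} from an mmCIF file (requires Biopython)."""
    from Bio.PDB import MMCIFParser, PPBuilder  # lazy import: optional dependency

    structure = MMCIFParser(QUIET=True).get_structure(structure_id, structure_path)
    sequences = {}
    for chain in structure[0]:  # first model only
        peptides = PPBuilder().build_peptides(chain)
        # A chain with gaps yields several polypeptides; join them in order.
        sequences[chain.id] = "".join(str(pp.get_sequence()) for pp in peptides)
    return sequences


def unique_sequences(sequences):
    """Group chain IDs by identical sequence (fibril chains repeat one sequence)."""
    groups = {}
    for chain_id, seq in sorted(sequences.items()):
        groups.setdefault(seq, []).append(chain_id)
    return groups


def to_fasta(groups, prefix="8PK4", width=60):
    """Format {sequence: [chain_ids]} as FASTA text, one record per unique sequence."""
    lines = []
    for seq, chains in groups.items():
        lines.append(f">{prefix}_{''.join(chains)}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def blastp_nr(sequence):
    """Submit one sequence to NCBI BLASTP against nr (network call; can be slow)."""
    from Bio.Blast import NCBIWWW, NCBIXML

    handle = NCBIWWW.qblast("blastp", "nr", sequence)
    record = NCBIXML.read(handle)
    handle.close()
    return record


# Toy input standing in for the output of chain_sequences().
demo = unique_sequences({"A": "MDVFMKGL", "C": "MDVFMKGL", "E": "MDVF"})
fasta_text = to_fasta(demo, width=4)
```

Collapsing identical chains before the BLAST search avoids submitting the same query once per fibril chain.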
2.5. Creating Protein Embeddings Using Machine Learning
To obtain numerical representations of the protein sequences, transformer-based models (e.g., ProtBERT and ESM) were used to convert each sequence into a numerical format as follows:
1) The sequences were tokenised and passed through the pretrained models.
2) The outputs from the hidden layers were extracted.
3) Mean pooling was applied to produce fixed-length embeddings.
The resulting protein embeddings capture structural, functional, and evolutionary information.
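The mean-pooling step in the list above reduces a variable-length matrix of per-residue hidden states to one fixed-length vector. The sketch below shows the operation with plain Python lists; in practice the hidden states would come from the pretrained model as tensors, and the function name `mean_pool` and the toy numbers are illustrative assumptions.

```python
def mean_pool(hidden_states, attention_mask):
    """Average per-token vectors, skipping masked (padding or special) positions.

    hidden_states: list of L vectors, each of length D (last hidden layer).
    attention_mask: list of L ints, 1 = keep the token, 0 = ignore it.
    """
    kept = [vec for vec, keep in zip(hidden_states, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask removes every token")
    dim = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dim)]


# Toy example: 4 tokens with 3-dimensional hidden states; the last token is padding.
tokens = [[1.0, 0.0, 2.0], [3.0, 2.0, 2.0], [2.0, 4.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]
embedding = mean_pool(tokens, mask)  # one vector of length 3, whatever L is
```

Excluding padded positions matters when sequences of different lengths are batched together; otherwise the padding vectors would bias the average.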
2.6. Embedding Integration and Clustering
The ProtBERT and ESM embeddings were concatenated to form a combined feature vector for each sequence, and hierarchical clustering was performed using Ward's method to group similar sequences.
Visualisation methods:
1) Dendrogram (or phylogram): to visualise similarity relationships.
2) Sankey diagram: to show the distribution of sequences across clusters.
By performing this analysis, we were able to demonstrate the existence of separate protein clusters based upon embedded representations.
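Concatenation followed by Ward clustering can be sketched without external libraries. The code below is a naive illustration under stated assumptions: at each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares, which is Ward's criterion. The toy vectors and the names `concat`, `ward_cost`, and `ward_clusters` are hypothetical, and a library routine would normally be preferred for real embeddings.

```python
from itertools import combinations


def concat(*vectors):
    """Concatenate several embedding vectors into one feature vector."""
    out = []
    for vec in vectors:
        out.extend(vec)
    return out


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    na, nb = a["size"], b["size"]
    sq_dist = sum((x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"]))
    return na * nb / (na + nb) * sq_dist


def ward_clusters(points, n_clusters):
    """Agglomerate points with Ward's criterion until n_clusters remain."""
    clusters = [{"members": [i], "size": 1, "centroid": list(p)}
                for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        size = a["size"] + b["size"]
        merged = {
            "members": sorted(a["members"] + b["members"]),
            "size": size,
            # Size-weighted mean of the two centroids.
            "centroid": [(a["size"] * x + b["size"] * y) / size
                         for x, y in zip(a["centroid"], b["centroid"])],
        }
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(c["members"] for c in clusters)


# Toy embeddings for five sequences (values are illustrative, not real model output).
protbert = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [9.0, 0.0]]
esm = [[1.0], [1.1], [4.0], [4.1], [8.0]]
features = [concat(p, e) for p, e in zip(protbert, esm)]
groups = ward_clusters(features, n_clusters=3)
```

Because the two models produce vectors of different lengths and scales, concatenation implicitly weights the larger embedding more heavily; scaling each block before concatenation is one way to balance them.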
2.7. AI/ML-based Candidate Prioritisation
By integrating both docking results and machine learning information, we were able to prioritise candidate compounds. Compounds were included based on three factors:
1) The binding affinity (via CB-Dock).
2) The patterns of interactions exhibited.
3) Network connectivity.
Through this integrated approach, we were able to determine the most biologically relevant candidates.
2.8. ADMET Prediction: ProTox-II
The shortlisted compounds were evaluated for predicted toxicity using ProTox-II.
Toxicity parameters included:
1) Toxicity Classification
2) LD50 (median lethal dosage)
3) Hepatotoxicity
4) Carcinogenicity
5) Immunotoxicity
This step provided an early indication of whether the selected compounds are likely to have an acceptable safety profile.
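The relationship between a predicted LD50 and its toxicity class can be sketched as a simple threshold lookup. The cut-offs below are the GHS-style oral toxicity bands documented for ProTox-II (Class I up to 5 mg/kg, through Class VI above 5000 mg/kg); the function name `ghs_class` is hypothetical, and the LD50 values and classes are those reported in Table 1.

```python
# Inclusive upper LD50 bounds (mg/kg) for classes I-V; above 5000 mg/kg is class VI.
GHS_BOUNDS = [(5, "I"), (50, "II"), (300, "III"), (2000, "IV"), (5000, "V")]


def ghs_class(ld50_mg_per_kg):
    """Map an oral LD50 value (mg/kg) to a toxicity class label."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper, label in GHS_BOUNDS:
        if ld50_mg_per_kg <= upper:
            return label
    return "VI"


# (LD50 in mg/kg, class as reported in Table 1)
REPORTED = {
    "Baicalein": (3919, "V"),
    "Rutin": (5000, "V"),
    "Ferulic acid": (1190, "IV"),
    "Kaempferol": (3919, "V"),
    "Ellagic acid": (2991, "IV"),
}

# Compounds whose reported class differs from the threshold rule.
mismatches = [name for name, (ld50, cls) in REPORTED.items()
              if ghs_class(ld50) != cls]
```

Applied to Table 1, the rule reproduces the reported class for four of the five compounds. Ellagic acid (2991 mg/kg, reported as Class IV) falls in the Class V band under these cut-offs, so the class printed by the tool should be checked against its LD50 rather than recomputed silently.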
2.9. Software and Visualisation Tools
All analyses were performed using Python-based tools, specifically:
1) Matplotlib
2) Seaborn
3) Plotly (for Sankey diagrams and network visualisation)
3. Results and Discussion
3.1. Molecular Docking Analysis of Selected Compounds Against α-Synuclein (8PK4)
We conducted molecular docking with the selected bioactive compounds to determine their potential to bind to the α-synuclein polymorphic structure 8PK4, which represents a fibrillar form related to Parkinson's disease. Docking was performed with CB-Dock, which detects cavities and scores poses with AutoDock Vina to find optimal binding sites for the ligands.
3.1.1. Analysis of Binding Affinity
All of the selected compounds showed favourable binding interactions with the target protein, with varying binding strength (Figure 1).
1) Ellagic acid had the highest binding affinity, with a Vina score of -9.0 kcal/mol, indicating the most stable binding among the selected compounds.
2) Baicalein (Vina score -8.8 kcal/mol) was very close behind ellagic acid, indicating a strong potential to bind to the fibrillar structure.
3) Kaempferol had a moderate binding affinity, with a Vina score of -8.7 kcal/mol, and most binding sites gave similar affinities.
4) Ferulic acid had the lowest binding affinity of the selected compounds (-6.6 kcal/mol) and therefore the weakest potential to bind to the fibrillar structure.
5) Rutin displayed strong binding to the fibrillar structure by engaging conserved residues across multiple chains.
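The ranking implied by the list above can be reproduced with a short script. The Vina scores are those reported in the text; no numerical score is given for rutin, so it is left as `None` and excluded from the ranking. The function name `rank_by_affinity` is a hypothetical helper.

```python
# Vina scores (kcal/mol) as reported above; rutin has no score stated in the text.
VINA_SCORES = {
    "Ellagic acid": -9.0,
    "Baicalein": -8.8,
    "Kaempferol": -8.7,
    "Ferulic acid": -6.6,
    "Rutin": None,
}


def rank_by_affinity(scores):
    """Sort ligands from strongest to weakest binder (most negative score first)."""
    scored = {name: s for name, s in scores.items() if s is not None}
    return sorted(scored, key=scored.get)


ranking = rank_by_affinity(VINA_SCORES)
best = ranking[0]
# Score difference of each ligand relative to the best binder, in kcal/mol.
gaps = {name: round(VINA_SCORES[name] - VINA_SCORES[best], 1) for name in ranking}
```

The three flavonoid-like binders lie within 0.3 kcal/mol of each other, a gap that is small relative to the typical uncertainty of docking scores, whereas ferulic acid trails by 2.4 kcal/mol.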
Figure 1. Molecular docking interactions of selected bioactive compounds with α-synuclein polymorph 8PK4: a) baicalein, b) rutin, c) ellagic acid, d) kaempferol, and e) ferulic acid docked within the predicted binding cavities of the protein. The structure is dominated by α-helices (yellow regions) connected by loops (grey), which suggests structural rigidity and functional specificity. The central region contains the ligand (grey spherical model). The interaction pattern (hydrogen bonds and hydrophobic contacts) indicates strong binding affinity.
3.1.2. Cavity Preference and Binding Site Analysis
CB-Dock identified five separate binding sites (cavities C1-C5) in the fibrillar structure. Among these, cavity C5 (volume ~824 ų) gave the highest binding affinity for most of the selected compounds. Larger cavities, such as C1 (~1395 ų), also accommodated ligand binding but with slightly reduced affinity.
This suggests that binding affinity does not depend solely on cavity size, but rather on the complementarity between ligand structure and pocket environment.
3.1.3. Residue-level Interaction Analysis
The residue-level interaction analysis indicated that the binding of the ligands occurs mainly in the central fibrillar core area and involves conserved residues from different chains.
Some of the common interacting residues include:
1) PHE4 → hydrophobic stacking interaction
2) MET5 → stabilising hydrophobic contact
3) LYS6 / LYS21 / LYS23 / LYS32 / LYS34 → electrostatic and hydrogen bond interactions
4) GLU20 / GLU28 / GLU35 / GLU46 → polar and ionic interactions
5) THR33, GLY36, VAL37 → structural stabilisation
Kaempferol interacted with a larger network of residues (including THR33, GLU46, and VAL37), suggesting a wider interaction surface. The interactions of ellagic acid and baicalein were more concentrated on the core residues than those of kaempferol, consistent with their higher binding affinity. Ferulic acid had a limited number of interacting residues, consistent with its weaker docking score. The interaction pattern of rutin was very similar to that of baicalein and ellagic acid, indicating binding at the same position in the core fibrillar region.
3.1.4. Multi-chain Interaction and Binding Mechanism
An interesting observation from the docking results is that all of the compounds interacted with residues from several chains (A, C, E, G, I). Because α-synuclein fibrils are stabilised by interchain interactions, disrupting these interactions would be expected to destabilise fibril formation. Compounds such as ellagic acid, baicalein, and rutin may therefore act as aggregation inhibitors by targeting interchain contacts.
3.2. Explanation for Structural Differences in Binding
The structural properties of the compounds are critical to understanding the differences in binding affinity.
Ellagic acid has a rigid polyphenolic structure with many hydroxyl groups, which form strong hydrogen bonds. Its planar ring system allows π-π stacking with aromatic residues such as PHE4. Baicalein has a flavonoid structure that supports both hydrophobic and polar interactions, providing a good balance between the two. Kaempferol is similar to baicalein, but its structural orientation in the pocket is less favourable. Ferulic acid is smaller and less complex than either ellagic acid or kaempferol and therefore has fewer sites for interaction and binds less strongly. Rutin is a glycosylated flavonoid, which gives it three major characteristics:
1) Many hydroxyl (-OH) groups, allowing for strong hydrogen bonding capability.
2) Aromatic rings, which allow for both hydrophobic and π-π interactions.
3) A large molecular size, which allows simultaneous interactions with more residues than ellagic acid or kaempferol.
Based on the docking analysis, the C5 cavity represents a significant druggable site. Binding is driven by hydrophobic interactions, electrostatic interactions, and hydrogen bonding, and the more potent ligands are those with low binding energy, interactions with conserved core residues, and multi-chain binding. Overall, the docking results indicate that ellagic acid is the strongest binder, baicalein and rutin are highly promising candidates, kaempferol shows moderate interaction, and ferulic acid shows comparatively weaker binding.
3.3. Compound-protein Interaction Network Analysis
Figure 2. Compound-protein interaction network of 8PK4 showing interactions with diverse molecules, including neurotransmitters, oxidative stress markers, and bioactive compounds, highlighting its central role in Parkinson’s disease-associated pathways.
We created a compound-protein interaction network to explore the interactions of the α-synuclein polymorph 8PK4 with many different biologically relevant compounds (Figure 2). The hub-and-spoke architecture of this network places the 8PK4 protein at the centre, surrounded by numerous compound nodes, indicating that the protein can interact with a broad range of chemically and functionally diverse compounds. Compounds interacting with 8PK4 in the network included neurotransmitters (dopamine, serotonin, norepinephrine), metabolic intermediates (inosine, phosphate, riboflavin), oxidative stress-related compounds (superoxide, glutathione, malondialdehyde), environmental neurotoxicants (paraquat, MPTP, pesticides), and natural bioactive compounds (resveratrol, quercetin, indoles). The variety of interacting partners indicates that α-synuclein is multifunctional and participates in multiple biological pathways. Two particularly striking results from the network are the strong relationship between α-synuclein and dopamine and the interaction of dopamine's toxic metabolites (e.g., DOPAL) with α-synuclein.
The association of these compounds with α-synuclein supports the previously established pathophysiology of Parkinson's disease, in which oxidative stress from dopamine oxidation creates reactive intermediates involved in the misfolding and aggregation of α-synuclein. The network further illustrates this pathway through direct links between α-synuclein and several dopamine-related compounds and oxidative stress markers, including superoxide and malondialdehyde, indicating that oxidative stress plays a central role in modulating protein behaviour and aggregation propensity. α-Synuclein also interacts with glutathione (GSH), one of the most abundant intracellular antioxidants, which may reflect a protective or compensatory response against oxidative injury. However, reactive oxygen species and lipid peroxidation products appear to tip the balance between oxidative stress and antioxidant defences, which may promote α-synuclein aggregation and neurodegeneration.
Other noteworthy aspects of the network include environmental neurotoxins such as paraquat and MPTP, which are known to produce Parkinson-like symptoms. The association between these neurotoxins and α-synuclein suggests that environmental factors may modify the structure and stability of α-synuclein and thereby contribute to neuropathology. This relationship provides a mechanistic link between external exposures and neurodegenerative disease processes.
The network also contains natural bioactive compounds with antioxidant and neuroprotective activities, such as resveratrol, quercetin, and indoles. This finding is consistent with the molecular docking results in this study, in which structurally similar compounds such as ellagic acid, baicalein, and rutin bound favourably to α-synuclein. Moreover, the overlap between the bioactive compounds in the network and the compounds identified in docking provides further evidence that polyphenolic compounds have potential for targeting α-synuclein to inhibit its aggregation.
3.4. Machine Learning-based Clustering Analysis
To analyse the evolutionary and structural relationships among protein sequences related to the α-synuclein polymorph 8PK4, we used transformer-based models (ProtBERT and ESM) to create an embedding for each protein sequence and then applied hierarchical clustering to group sequences with similar feature representations.
The resulting dendrogram (phylogram) indicated that the protein sequences can be grouped into separate clusters based on their evolutionary and structural similarity; closely related sequences clustered tightly with short branches, while more divergent sequences were separated by larger distances (Figure 3). This suggests that the learned feature representations (embeddings) captured fundamental features of the protein sequences, including typical patterns (motifs) and structural characteristics.
In our analysis, one large cluster contained multiple homologous protein sequences. The similarity among the sequences in this cluster may point to conserved functional domains. In contrast, some small clusters contained sequences with distinct sequence or structural architecture, indicating possibly different functional capabilities. Differences in amino acid composition and arrangement probably contribute to how these proteins fold and interact with other proteins.
In addition, the Sankey diagram showed how the protein sequences are distributed across clusters, giving a clear representation of cluster membership. The flow of sequences into clusters shows that the grouping was consistent and supports the findings of the hierarchical clustering analysis (Figure 4). The large cluster containing most of the sequences suggests that most of the proteins share a common ancestor or template. Smaller clusters may represent unique or specialised proteins with different interaction tendencies. From a biological perspective, the clustering results further support the significance of the α-synuclein-related sequences analysed. The conserved clusters indicate that structural characteristics important for fibril formation are retained across the homologues analysed. The 8PK4 structure can therefore reasonably be used as a model for studying protein aggregation and protein-protein interactions.
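The data behind a sequence-to-cluster Sankey diagram is a set of node labels plus parallel lists of link sources, targets, and values. The sketch below builds those lists from cluster assignments; the sequence IDs and cluster labels are placeholders, and `sankey_links` is a hypothetical helper rather than the authors' code.

```python
from collections import Counter


def sankey_links(assignments):
    """Turn {sequence_id: cluster_label} into Sankey node and link lists.

    Sequence nodes come first, followed by cluster nodes, so the index of a
    cluster node is offset by the number of sequences.
    """
    seq_ids = sorted(assignments)
    clusters = sorted(set(assignments.values()))
    offset = len(seq_ids)
    return {
        "label": seq_ids + clusters,
        "source": list(range(offset)),
        "target": [offset + clusters.index(assignments[s]) for s in seq_ids],
        "value": [1] * offset,  # each sequence contributes one unit of flow
    }


# Placeholder assignments standing in for the clustering output.
assignments = {"seq1": "Cluster 1", "seq2": "Cluster 1", "seq3": "Cluster 2"}
links = sankey_links(assignments)
cluster_sizes = Counter(assignments.values())
```

These lists correspond to the `node.label` and `link.source`, `link.target`, and `link.value` fields of Plotly's `go.Sankey` trace, so they can be passed to it directly for plotting.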
Integrating transformer-based embeddings into the analysis also improves the clusters by incorporating contextual information that traditional alignment-based analyses do not capture. Unlike standard methods based on pairwise sequence similarity, transformer-based embeddings account for long-range sequence dependencies and multiple amino acid variations, which can yield more informative clusters.
Clustering analyses support the docking results, because proteins that fall within the same cluster are likely to share similar structural behaviour and binding interactions.
Figure 3. Dendrogram of protein sequences associated with the alpha-synuclein polymorph 8PK4, clustered using integrated transformer-derived embeddings (ProtBERT and ESM). Clustering used Ward's minimum variance method, with branch distances indicating the degree of difference between protein sequences. Closely spaced sequences in the same cluster are similar in structure and evolution, whereas distantly spaced sequences are likely to differ in their structural features.
Figure 4. Sankey diagram of 8PK4-related protein clustering from the machine learning-based embedding analysis of homologous sequences. Nodes on the left are individual protein sequences, and nodes on the right are clusters of similar proteins. The diagram depicts the flow of sequences into their assigned clusters and highlights the major clusters of similar protein sequences.
Taken together, these results suggest that the compounds identified by docking (ellagic acid, baicalein, rutin) have high affinities for binding regions that are conserved among protein family members. Overall, the clustering analyses agreed well with the relationship of the 8PK4 polymorph to its sequence homologues, supporting its biological significance. The dendrogram and the Sankey visualisation together provide both hierarchical and flow views for evaluating protein similarity. These conclusions complement the integrated computational pipeline and, alongside the docking and interaction analyses, clarify the relationships between sequence and structure that inform the identification of potential therapeutic compounds.
3.5. Integrated Toxicity and Safety Analysis
To assess the toxicity, off-target effects, and pharmacological suitability of the selected bioactive compounds targeting α-synuclein polymorph 8PK4, toxicity profiling was performed with the ProTox-II web server. The acute toxicity predictions were as follows:
1. Rutin had the highest predicted LD50 (5000 mg/kg, Class V), indicating the lowest acute toxicity and the greatest safety margin among the compounds tested (Table 1).
2. Baicalein and kaempferol also showed a high predicted level of safety (LD50 of 3919 mg/kg, Class V).
3. Ellagic acid (LD50 of 2991 mg/kg, Class IV) and ferulic acid (LD50 of 1190 mg/kg, Class IV) had relatively greater predicted acute toxicity than rutin, baicalein, and kaempferol, but remained within an acceptable range.
The endpoint-specific predictions also support rutin as the least toxic compound, with no predicted activity for the major endpoints of neurotoxicity, mutagenicity, and enzyme interactions. Together with its high LD50, this makes rutin the most favourable candidate from a safety perspective. Ellagic acid, although classified as Class IV, showed low predicted toxicity for most endpoints, including the absence of neurotoxicity, mutagenicity, and cytochrome P450 inhibition. Given this acceptable predicted toxicity and its superior docking affinity, ellagic acid is a strong candidate for development.
Although baicalein is a Class V compound with relatively low predicted acute toxicity, it is predicted to be mutagenic and to interact with the tumour suppressor p53. This indicates a potential genotoxicity risk that could limit its therapeutic use despite its favourable docking result. Kaempferol (Class V) is predicted to be active for mitochondrial membrane potential disruption and CYP2D6 interaction, suggesting that it may affect both cellular metabolism and drug metabolism pathways; these mechanisms may contribute to off-target toxicity under specific conditions.
Ferulic acid, an established antioxidant, has the lowest predicted LD50 of the compounds tested and is also predicted to be neurotoxic and to act on important enzymes such as AChE and CYP3A4. This indicates an increased chance of adverse effects and drug–drug interactions, reducing its viability as a therapeutic candidate.
Table 1. ProTox-II-based toxicity prediction of selected compounds showing activity across multiple toxicity endpoints and enabling comparative safety assessment for targeting 8PK4.

| Endpoint | Baicalein | Rutin | Ferulic acid | Kaempferol | Ellagic acid |
|---|---|---|---|---|---|
| LD50 (mg/kg) | 3919 | 5000 | 1190 | 3919 | 2991 |
| Toxicity class | Class V | Class V | Class IV | Class V | Class IV |
| Neurotoxicity | Inactive (0.89) | Inactive (0.89) | Active (0.87) | Inactive (0.89) | Inactive (0.91) |
| Respiratory toxicity | Active (0.83) | Active (0.63) | Active (0.98) | Active (0.83) | Active (0.84) |
| Mutagenicity | Active (0.51) | Inactive (0.88) | Inactive (0.97) | Inactive (0.52) | Inactive (0.84) |
| Blood-brain barrier (BBB) permeability | Active (0.53) | Inactive (0.64) | Inactive (1) | Active (0.57) | Inactive (0.90) |
| Nutritional toxicity | Inactive (0.53) | Inactive (0.75) | Inactive (0.56) | Active (0.66) | Active (0.60) |
| Peroxisome proliferator-activated receptor gamma (PPAR-Gamma) | Active (0.63) | Active (0.54) | Inactive (0.74) | Inactive (0.95) | Active (0.71) |
| Nuclear factor (erythroid-derived 2)-like 2/antioxidant responsive element (NRF2/ARE) | Inactive (0.98) | Inactive (0.98) | Inactive (0.88) | Inactive (0.99) | Inactive (0.99) |
| Mitochondrial membrane potential (MMP) | Inactive (0.99) | Inactive (0.99) | Inactive (0.70) | Active (1) | Inactive (0.86) |
| Phosphoprotein (tumour suppressor) p53 | Active (1) | Inactive (0.97) | Inactive (0.96) | Inactive (0.92) | Inactive (0.95) |
| GABA receptor (GABAR) | Inactive (0.97) | Inactive (0.90) | Inactive (0.96) | Inactive (0.96) | Inactive (0.66) |
| Glutamate N-methyl-D-aspartate receptor (NMDAR) | Inactive (0.96) | Inactive (0.96) | Inactive (0.92) | Inactive (0.92) | Inactive (0.98) |
| alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionate receptor (AMPAR) | Inactive (0.92) | Inactive (0.92) | Inactive (0.97) | Inactive (0.97) | Inactive (1) |
| Acetylcholinesterase (AChE) | Inactive (0.69) | Inactive (0.97) | Active (0.69) | Inactive (0.68) | Inactive (0.73) |
| Cytochrome CYP2D6 | Inactive (0.85) | Inactive (0.80) | Inactive (0.85) | Active (0.62) | Inactive (0.82) |
| Cytochrome CYP3A4 | Inactive (0.79) | Inactive (0.92) | Active (0.79) | Inactive (0.65) | Inactive (0.95) |
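The LD50 and toxicity-class rows of Table 1 can be cross-checked with a few lines of Python. The sketch below is illustrative rather than part of the study's workflow. It assumes the GHS acute oral toxicity bands that ProTox-II documents for its classes (Class I up to 5 mg/kg, II up to 50, III up to 300, IV up to 2000, V up to 5000, VI above 5000). The names `REPORTED`, `GHS_BANDS`, `ghs_class`, and `class_mismatches` are hypothetical helpers introduced here.

```python
# LD50 values (mg/kg) and toxicity classes as reported in Table 1.
REPORTED = {
    "Baicalein": (3919, "V"),
    "Rutin": (5000, "V"),
    "Ferulic acid": (1190, "IV"),
    "Kaempferol": (3919, "V"),
    "Ellagic acid": (2991, "IV"),
}

# Assumed GHS acute oral toxicity bands: inclusive upper LD50 bound (mg/kg)
# for classes I-V; anything above 5000 mg/kg is treated as Class VI.
GHS_BANDS = [(5, "I"), (50, "II"), (300, "III"), (2000, "IV"), (5000, "V")]


def ghs_class(ld50_mg_per_kg: float) -> str:
    """Return the class whose LD50 band contains the given value."""
    for upper, label in GHS_BANDS:
        if ld50_mg_per_kg <= upper:
            return label
    return "VI"


def class_mismatches(reported: dict) -> dict:
    """Map compound -> (reported class, band-derived class) where they differ."""
    return {
        name: (cls, ghs_class(ld50))
        for name, (ld50, cls) in reported.items()
        if ghs_class(ld50) != cls
    }


if __name__ == "__main__":
    for name, (ld50, cls) in REPORTED.items():
        band = ghs_class(ld50)
        print(f"{name:13s} LD50={ld50:5d} mg/kg  reported={cls:3s} band={band}")
    print("Mismatches:", class_mismatches(REPORTED))
```

On the Table 1 values, four of the five reported classes agree with the band that contains the LD50. Ellagic acid is the exception: it is reported as Class IV, although 2991 mg/kg falls in the Class V band. The table is reproduced as reported; such a difference would be worth confirming against the original server output.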

Multiple toxicity endpoints, including neurotoxicity, mutagenicity, and receptor interactions, were evaluated to provide a comprehensive assessment of the safety of the selected compounds. Ellagic acid and rutin exhibited the best overall toxicity profiles, being predicted inactive across several key endpoints, including neurotoxicity and mutagenicity. They were also predicted inactive against the major neuronal receptors involved in neurotransmission (GABA, NMDA, and AMPA), suggesting a low likelihood of adverse neurological effects or interference with neurotransmission pathways. Both qualities are particularly important for drug candidates intended to treat neurodegenerative diseases. In contrast, ferulic acid was predicted to be neurotoxic, indicating potential negative effects on neuronal function despite its known antioxidant properties.
In the mutagenicity analysis, baicalein was predicted to have mutagenic potential, whereas rutin, ferulic acid, kaempferol, and ellagic acid were predicted to be non-mutagenic. Baicalein's predicted activity against the tumour suppressor protein p53 suggests possible activation of the DNA damage response or genotoxic stress, which raises concerns about its long-term safety profile. In contrast, compounds predicted to be inactive against the cytochrome P450 isoenzymes are more feasible as safer alternatives. Every compound was classified as "active" for respiratory toxicity, but the prediction probabilities differed (0.63 to 0.98 in Table 1), so experimental validation is needed, particularly because the model may have limitations for specific compound classes. Still, this shared prediction does not diminish the overall safety advantage seen in the selected lead molecules.
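Because each ProTox-II call in Table 1 carries a probability, the weaker predictions can be listed explicitly before deciding what to validate experimentally. The following is a minimal sketch using the mutagenicity, p53, and respiratory toxicity rows of Table 1. The cut-off of 0.70 and the names `PREDICTIONS`, `CONFIDENCE_CUTOFF`, and `low_confidence` are assumptions made for illustration, not values or functions from the study.

```python
# (label, probability) pairs copied from Table 1 for three endpoints.
PREDICTIONS = {
    "Mutagenicity": {
        "Baicalein": ("Active", 0.51),
        "Rutin": ("Inactive", 0.88),
        "Ferulic acid": ("Inactive", 0.97),
        "Kaempferol": ("Inactive", 0.52),
        "Ellagic acid": ("Inactive", 0.84),
    },
    "p53": {
        "Baicalein": ("Active", 1.00),
        "Rutin": ("Inactive", 0.97),
        "Ferulic acid": ("Inactive", 0.96),
        "Kaempferol": ("Inactive", 0.92),
        "Ellagic acid": ("Inactive", 0.95),
    },
    "Respiratory toxicity": {
        "Baicalein": ("Active", 0.83),
        "Rutin": ("Active", 0.63),
        "Ferulic acid": ("Active", 0.98),
        "Kaempferol": ("Active", 0.83),
        "Ellagic acid": ("Active", 0.84),
    },
}

# Hypothetical cut-off: calls below this probability are treated as uncertain.
CONFIDENCE_CUTOFF = 0.70


def low_confidence(predictions: dict, cutoff: float = CONFIDENCE_CUTOFF) -> list:
    """Return (endpoint, compound, label, probability) rows below the cut-off,
    sorted from least to most confident."""
    flagged = [
        (endpoint, compound, label, prob)
        for endpoint, by_compound in predictions.items()
        for compound, (label, prob) in by_compound.items()
        if prob < cutoff
    ]
    return sorted(flagged, key=lambda row: row[3])


if __name__ == "__main__":
    for endpoint, compound, label, prob in low_confidence(PREDICTIONS):
        print(f"{endpoint:21s} {compound:13s} {label:8s} p={prob:.2f}")
```

With this assumed cut-off, the mutagenicity calls for baicalein (active, 0.51) and kaempferol (inactive, 0.52) and the respiratory toxicity call for rutin (active, 0.63) are flagged, which is where experimental follow-up would be most informative.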
The assessment of blood-brain barrier (BBB) permeability predicted that baicalein and kaempferol are BBB permeable, whereas rutin, ferulic acid, and ellagic acid are not. Predicted BBB permeability is considered beneficial for delivering drugs to the central nervous system (CNS). However, the strong binding affinity and stable interaction profile of ellagic acid support its potential use through alternative delivery routes or mechanisms.
The assessment of cytochrome P450 enzyme (CYP3A4 and CYP2D6) activity indicated that ferulic acid (CYP3A4) and kaempferol (CYP2D6) may cause drug-drug interactions and altered metabolism. In contrast, ellagic acid and rutin were predicted inactive against both enzymes, which suggests greater metabolic stability and a lower chance of adverse pharmacokinetic behaviour than ferulic acid and kaempferol. Analysis of the mitochondrial membrane potential (MMP) endpoint showed that kaempferol was predicted active, suggesting some potential for mitochondrial impairment and disruption of cellular homeostasis. Such effects may be associated with cytotoxicity, depending on the conditions. In contrast, neither ellagic acid nor rutin was predicted active for this endpoint, which supports their favourable safety profiles.
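The CYP and MMP comparison above amounts to collecting, for each compound, the endpoints on which it was called "Active". The sketch below does this for the CYP2D6, CYP3A4, and MMP rows of Table 1. The names `COMPOUNDS`, `CALLS`, and `active_endpoints` are hypothetical and are used only for illustration.

```python
# Column order of Table 1.
COMPOUNDS = ["Baicalein", "Rutin", "Ferulic acid", "Kaempferol", "Ellagic acid"]

# Activity calls copied from Table 1, one list per endpoint in column order.
CALLS = {
    "CYP2D6": ["Inactive", "Inactive", "Inactive", "Active", "Inactive"],
    "CYP3A4": ["Inactive", "Inactive", "Active", "Inactive", "Inactive"],
    "MMP": ["Inactive", "Inactive", "Inactive", "Active", "Inactive"],
}


def active_endpoints(calls: dict, compounds: list) -> dict:
    """Return compound -> list of endpoints predicted 'Active'."""
    flagged = {name: [] for name in compounds}
    for endpoint, labels in calls.items():
        for name, label in zip(compounds, labels):
            if label == "Active":
                flagged[name].append(endpoint)
    return flagged


if __name__ == "__main__":
    for name, endpoints in active_endpoints(CALLS, COMPOUNDS).items():
        print(f"{name:13s} {', '.join(endpoints) if endpoints else 'none'}")
```

On these three rows, ferulic acid is flagged for CYP3A4 and kaempferol for CYP2D6 and MMP, while baicalein, rutin, and ellagic acid have no active calls.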
Based on the toxicological evaluation with ProTox-II, the tested compounds showed similar distributions of physicochemical properties and descriptor variability, and their toxicity data clustered in relation to one another (Figure 5).
The molecular weights of most compounds fall within the optimal range for drug-like molecules and follow a right-skewed distribution. Compounds whose molecular weights lie outside this range may nevertheless be pharmacologically appropriate.
The physicochemical property profiles of the compounds were visualised using radar plots. In general, the properties were well balanced. Minor variation in some descriptors suggests differences in molecular complexity, which may in turn affect binding affinity, interactions, and toxicity.
The clustering network analysis indicated that the compounds could be grouped into clusters based on their predicted biological activity and toxicity. One cluster contained compounds with favourable predicted pharmacological activity, while the cluster containing compounds with predicted toxicity potential raised safety concerns. This supports the ability of the machine-learning models to distinguish safer compounds from those that pose a risk.
Overall, the analysis demonstrates that the majority of the compounds tested have acceptable drug-like characteristics with differentiated toxicity profiles. It also supports the findings from the docking analysis and confirms that appropriate testing is needed to identify compounds with an optimal combination of efficacy and safety.
Figure 5. Comprehensive physicochemical and toxicity profiling of the studied compounds. The left panel shows the molecular weight distribution histograms, indicating a narrow distribution with mean values highlighted (red line). The middle panel presents radar plots summarising key drug-likeness and ADMET properties, demonstrating that most compounds fall within acceptable ranges. The right panel illustrates toxicity prediction networks, where compounds are clustered based on biological activity, highlighting potential toxicological endpoints (e.g., mutagenicity, carcinogenicity, and irritancy).
In addition, all compounds were predicted to be inactive in the NRF2/ARE pathway, indicating that their therapeutic effects are unlikely to be mediated by direct activation of antioxidant response pathways. Rather, their mode of action more likely involves direct binding to α-synuclein and inhibition of aggregation, a hypothesis supported by the molecular docking data. Combining toxicity predictions with molecular docking data makes the overall evaluation of candidate suitability more comprehensive. Ellagic acid exhibits the best overall properties for potential therapeutic utility based on its strong binding affinity (ranked 1st) and low predicted toxicity (ranked 2nd). Rutin also has a good predicted safety profile and could be considered a strong alternative candidate. Baicalein and kaempferol both raise moderate-to-high toxicity concerns despite their good predicted binding potential. Ferulic acid is predicted to have higher toxicity than the other compounds, limiting its potential therapeutic use. This study underscores the utility of conducting toxicity screening early in the drug discovery process.
4. Conclusions and Future Direction
This study developed a computational pipeline that combines structural bioinformatics, a multi-omics approach to sequence data, machine learning-based clustering, molecular docking, and toxicity prediction to help identify compounds that may be effective against α-synuclein polymorph 8PK4, a critical protein structure involved in the development of Parkinson's disease. Structural analysis of the 8PK4 fibrils revealed highly organised amyloid fibril structures. Their stability arises from residues conserved across the chains of the fibril, which mediate interchain and intrachain interactions and stabilise the fibril itself. The B-factor analysis, residue-level maps, and contact map analysis identified sites within the structure that are critical for maintaining structural integrity and for influencing aggregation.
These analyses provide a valuable framework for developing targeted drugs designed to affect α-synuclein fibril formation. Molecular docking identified several naturally occurring compounds with substantial binding affinity for the 8PK4 structure. Ellagic acid ranked first, followed by baicalein, rutin, and kaempferol. Binding occurred primarily at the key residues PHE4, LYS6, and GLU35, which lie in the conserved fibrillar core region and at the interface where two protofilaments come into contact. These interactions indicate that the compounds may be able to disrupt protein aggregation and stabilise non-toxic conformations of the protein.
ProTox-II toxicity predictions, together with the predicted LD50 (median lethal dose) and overall toxicity class, were used to evaluate the safety profiles of the selected compounds. Rutin had the highest LD50 value and the lowest predicted toxicity, which suggests a favourable acute safety margin. Ellagic acid, while classified as a Class IV compound, showed few predicted off-target activities and a low overall toxicity profile. In comparison, baicalein and kaempferol raised moderate toxicity concerns, while ferulic acid displayed greater predicted toxicity and less potential as a therapeutic candidate.
Integrating docking-based efficacy with toxicity assessment provided a well-rounded evaluation of the compounds' relative potency and safety. Based on these evaluations, ellagic acid has the strongest overall therapeutic support owing to its high binding affinity and low predicted toxicity, while rutin was the most favourable compound in terms of overall safety. This work illustrates the value of applying multiple levels of computation to drug discovery for neurodegenerative diseases. The resulting pipeline helps to identify potential inhibitors more quickly, while providing greater confidence in candidate selection by evaluating safety early on. The results provide a solid basis for future experimental validation and development of targeted therapies aimed at preventing α-synuclein aggregation in Parkinson’s disease.
Author Contributions
Neha Singh: Data Curation, Formal Analysis, Methodology, Writing – original draft
Uma Kumari: Conceptualization, Formal Analysis, Investigation, Supervision, Writing – original draft, Writing – review & editing
Dhanashri Mandra: Formal Analysis, Resources, Software
Aashritha Marouthu: Formal Analysis, Methodology
Conflicts of Interest
The authors declare no conflicts of interest.
Cite This Article
Singh, N., Kumari, U., Mandra, D., Marouthu, A. (2026). Integrated Machine Learning-based Toxicity Prediction with Molecular Docking for Safer Drug Candidate Screening in Parkinson’s Disease. Computational Biology and Bioinformatics, 14(1), 41-53. https://doi.org/10.11648/j.cbb.20261401.14