We introduce iLearnPlus, the first machine-learning platform with graphical and web-based interfaces for constructing machine-learning pipelines for analysis and prediction from nucleic acid and protein sequences. It provides a comprehensive set of algorithms and automates sequence-based feature extraction and analysis, model construction and deployment, assessment of predictive performance, statistical analysis, and visualization, all without any programming. iLearnPlus includes 147 unique feature sets that encode information from the input sequences and 21 machine-learning algorithms, 7 of which are deep-learning approaches, outnumbering current solutions by a wide margin. It caters both to experienced bioinformaticians, through its broad range of options, and to biologists with no programming background, through its point-and-click interface and easy-to-follow design process.
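
To make "feature sets which encode information from the input sequences" concrete, here is a minimal sketch of a k-mer frequency encoding, the general idea behind the Kmer descriptor listed under the DNA and RNA descriptors. It follows the usual textbook definition (normalized counts of every length-k subsequence) and is not taken from the iLearnPlus source. The function name `kmer_frequencies`, its parameters, and the choice to skip windows containing non-alphabet characters are assumptions made for illustration.

```python
from itertools import product


def kmer_frequencies(sequence, k=2, alphabet="ACGT"):
    """Return the normalized k-mer frequency vector of a nucleotide sequence.

    Hypothetical helper for illustration; not part of iLearnPlus.
    Features are ordered as itertools.product enumerates `alphabet`,
    and windows containing characters outside the alphabet are skipped.
    """
    # Enumerate all len(alphabet)**k possible k-mers in a fixed order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    # Slide a window of width k over the sequence and count each k-mer.
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1

    # Normalize by the number of counted windows (0.0 if there were none).
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]
```

For the sequence ACGTAC with k = 2 there are five windows (AC, CG, GT, TA, AC), so the AC entry of the 16-dimensional vector is 2/5 and the CG, GT, and TA entries are 1/5 each.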


 
             
 
iLearnPlus Interface
 
 
 
iLearnPlus Functionality
 
  • Extract Descriptors (147)
      • DNA Descriptors (46)
        • Kmer
        • RCKmer
        • MisMatch
        • Subsequence
        • NAC
        • ANF
        • ENAC
        • binary
        • PS2
        • PS3
        • PS4
        • CKSNAP
        • NCP
        • PSTNPss
        • PSTNPds
        • EIIP
        • PseEIIP
        • ASDC
        • DBE
        • LPDF
        • DPCP
        • DPCP type2
        • TPCP
        • TPCP type2
        • MMI
        • KNN
        • Z_curve_9bit
        • Z_curve_12bit
        • Z_curve_36bit
        • Z_curve_48bit
        • Z_curve_144bit
        • NMBroto
        • Moran
        • Geary
        • DAC
        • DCC
        • DACC
        • TAC
        • TCC
        • TACC
        • PseDNC
        • PseKNC
        • PCPseDNC
        • PCPseTNC
        • SCPseDNC
        • SCPseTNC
      • RNA Descriptors (35)
        • Kmer
        • Mismatch
        • Subsequence
        • NAC
        • ANF
        • NCP
        • PSTNPss
        • ENAC
        • binary
        • PS2
        • PS3
        • PS4
        • CKSNAP
        • ASDC
        • DBE
        • LPDF
        • DPCP
        • DPCP type2
        • MMI
        • KNN
        • Z_curve_9bit
        • Z_curve_12bit
        • Z_curve_36bit
        • Z_curve_48bit
        • Z_curve_144bit
        • NMBroto
        • Moran
        • Geary
        • DAC
        • DCC
        • DACC
        • PseDNC
        • PseKNC
        • PCPseDNC
        • SCPseDNC
      • Protein Descriptors (66)
        • AAC
        • EAAC
        • CKSAAP
        • DPC
        • DDE
        • TPC
        • binary
        • binary_6bit
        • binary_5bit type 1
        • binary_5bit type 2
        • binary_3bit type 1
        • binary_3bit type 2
        • binary_3bit type 3
        • binary_3bit type 4
        • binary_3bit type 5
        • binary_3bit type 6
        • binary_3bit type 7
        • AESNN3
        • GAAC
        • EGAAC
        • CKSAAGP
        • GDPC
        • GTPC
        • AAIndex
        • ZScale
        • BLOSUM62
        • NMBroto
        • Moran
        • Geary
        • CTDC
        • CTDT
        • CTDD
        • CTriad
        • KSCTriad
        • SOCNumber
        • QSOrder
        • PAAC
        • APAAC
        • OPF_10bit
        • OPF_7bit type 1
        • OPF_7bit type 2
        • OPF_7bit type 3
        • ASDC
        • KNN
        • DistancePair
        • AC
        • CC
        • ACC
        • PSEKRAAC type 1
        • PSEKRAAC type 2
        • PSEKRAAC type 3A
        • PSEKRAAC type 3B
        • PSEKRAAC type 4
        • PSEKRAAC type 5
        • PSEKRAAC type 6A
        • PSEKRAAC type 6B
        • PSEKRAAC type 6C
        • PSEKRAAC type 7
        • PSEKRAAC type 8
        • PSEKRAAC type 9
        • PSEKRAAC type 10
        • PSEKRAAC type 11
        • PSEKRAAC type 12
        • PSEKRAAC type 13
        • PSEKRAAC type 14
        • PSEKRAAC type 15
        • PSEKRAAC type 16
  • Clustering algorithms (10)
    • kmeans
    • MiniBatchKMeans
    • GM
    • Agglomerative
    • Spectral
    • MCL
    • hcluster
    • APC
    • meanshift
    • DBSCAN
  • Dimensionality Reduction algorithms (3)
    • PCA
    • t_SNE
    • LDA
  • Feature Selection algorithms (5)
    • CHI2
    • IG
    • FScore
    • MIC
    • Pearsonr
  • Feature Normalization methods (2)
    • ZScore
    • MinMax
  • Machine Learning algorithms (21)
    • RF
    • DecisionTree
    • LightGBM
    • SVM
    • MLP
    • XGBoost
    • KNN
    • LR
    • LDA
    • QDA
    • SGD
    • NaiveBayes
    • Bagging
    • AdaBoost
    • GBDT
    • CNN
    • RNN
    • BRNN
    • ABCNN
    • ResNet
    • AE
  • Estimate descriptor performance
  • Estimate machine-learning algorithm performance
  • Model integration
  • Plots (7)
    • ROC curve
    • PRC curve
    • Scatter plot (Clustering & Dimensionality reduction)
    • Histogram (Data distribution)
    • Kernel density plot (Data distribution)
    • Box plot
    • Heat map
  • Statistical analysis methods (2)
    • Student's t-test
    • Bootstrap test
  • Performance evaluation metrics (8)
    • Sensitivity (Recall)
    • Specificity
    • Accuracy
    • Matthews correlation coefficient
    • Precision
    • F1-Score
    • AUROC
    • AUPRC
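
For a binary classifier, the first six metrics listed above follow directly from the confusion-matrix counts. The sketch below uses their standard definitions; it is not taken from the iLearnPlus source, and the function name `binary_metrics`, its arguments, and the convention of reporting 0.0 for a zero denominator are assumptions made for illustration. AUROC and AUPRC are omitted because they require ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics for a binary classifier.

    Hypothetical helper for illustration; not part of iLearnPlus.
    Any ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall, true-positive rate
    specificity = ratio(tn, tn + fp)   # true-negative rate
    precision = ratio(tp, tp + fp)     # positive predictive value
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    # Matthews correlation coefficient, in the range [-1, 1].
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)

    return {
        "Sensitivity": sensitivity,
        "Specificity": specificity,
        "Accuracy": accuracy,
        "MCC": mcc,
        "Precision": precision,
        "F1": f1,
    }
```

With 40 true positives, 45 true negatives, 5 false positives, and 10 false negatives, `binary_metrics(40, 45, 5, 10)` gives a sensitivity of 40/50, a specificity of 45/50, and an accuracy of 85/100.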
 
 
Who is using iLearnPlus?