Biclustering Results Visualization of Gene Expression Data: A Review
by Haithem Aouabed 1, Mourad Elloumi 2 and Fahad Algarni 2
1 University of Sfax, Department of Computer Sciences, Faculty of Economic Sciences and Management, Sfax, 3018, Tunisia
2 University of Bisha, Department of Computer Sciences and Artificial Intelligence, College of Computing and Information Technology, Bisha, 67714, Saudi Arabia
* Author to whom correspondence should be addressed.
Journal of Engineering Research and Sciences, Volume 3, Issue 10, Page # 55-68, 2024; DOI: 10.55708/js0310006
Keywords: Biclustering algorithms, Biclusters, Overlaps, Visualization, Visualization techniques
Received: 26 August 2024, Revised: 09 October 2024, Accepted: 10 October 2024, Published Online: 29 November 2024
(This article belongs to the Special Issue on Multidisciplinary Sciences and Advanced Technology 2024 and the Section Biochemical Research Methods (BRM))
Biclustering is an unsupervised data mining method used to analyze gene expression data by identifying groups of genes that exhibit similar expression patterns across specific subsets of conditions. Discovering these groups of co-expressed genes and their associated conditions (called biclusters) can aid in understanding gene interactions in various biological contexts. Biclustering is characterized by its bi-dimensional nature, which groups both genes and conditions in the same bicluster, and by its overlapping property, which allows genes to belong to multiple biclusters. Biclustering algorithms often produce a large number of overlapping biclusters, and these specific characteristics make visualizing the results far from straightforward. Yet visualization is crucial for inferring patterns from the expression data. In this paper, we review the techniques for visualizing multiple biclusters simultaneously and evaluate them in order to help biologists choose the visualization techniques best suited to their needs.
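To make the two properties described above concrete, the following is a minimal sketch in Python of how a set of biclusters and their overlaps might be represented. It is an illustration only and is not taken from any of the tools reviewed: the bicluster names, gene labels, condition labels, and the helper functions `overlap` and `pairwise_overlaps` are all hypothetical. The sketch assumes that two biclusters overlap when they share at least one gene and at least one condition, that is, at least one cell of the expression matrix.

```python
from itertools import combinations

# Hypothetical biclusters: each one pairs a set of genes with a set of
# conditions (the bi-dimensional property). The labels are made up.
biclusters = {
    "B1": ({"g1", "g2", "g3"}, {"c1", "c2"}),
    "B2": ({"g2", "g3", "g4"}, {"c2", "c3"}),
    "B3": ({"g5"}, {"c1"}),
}


def overlap(a, b):
    """Return the genes and the conditions shared by two biclusters."""
    genes_a, conds_a = a
    genes_b, conds_b = b
    return genes_a & genes_b, conds_a & conds_b


def pairwise_overlaps(bics):
    """Map each pair of bicluster names to its shared genes and conditions.

    Only pairs sharing at least one gene AND one condition are kept
    (an assumption of this sketch about what counts as an overlap).
    """
    result = {}
    for name_a, name_b in combinations(sorted(bics), 2):
        genes, conds = overlap(bics[name_a], bics[name_b])
        if genes and conds:
            result[(name_a, name_b)] = (genes, conds)
    return result


overlaps = pairwise_overlaps(biclusters)
```

In this toy example, genes `g2` and `g3` belong to both `B1` and `B2` under condition `c2`, which is the kind of shared region a visualization of multiple biclusters has to display. The number of pairs to inspect grows quadratically with the number of biclusters, which gives a rough sense of why large result sets are hard to show at once.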