Lorena, A.C., Maciel I., de Miranda P.B.C., Costa I.G., Prudêncio, R.B.C., Data Complexity Meta-features for Regression Problems. Machine Learning, accepted.

Axelsson A.S., Mahdi T., Nenonen H.A., Singh T., Hänzelmann S., …, Costa I.G., Zhang E., Rosengren A.H., Sox5 regulates beta-cell phenotype and is reduced in type 2 diabetes, Nature Communications, 15652. [paper].

Hohwieler, M., Illing, A., Hermann, P. C., Mayer, T., Stockmann, M., Perkhofer, L., …, Costa, I. G., Seufferlein, T., Kormann, M., Wagner, M., Liebau, S., Kleger, A. (2017). Human pluripotent stem cell-derived acinar/ductal organoids generate human pancreas upon orthotopic transplantation and allow disease modelling. Gut, 66:473-486. [paper]


Gusmao E.G., Allhoff, M., Zenke, M.,  Costa, I.G. (2016), Analysis of computational footprinting methods for DNase sequencing experiments, Nature Methods, 13, 303–309 [paper][pre-print][supp].

Kalwa M., Hänzelmann S., Otto S., Kuo C.C., Franzen J., …, Weinhold E., Costa I. G.*, Wagner W.*, The lncRNA HOTAIR impacts on mesenchymal stem cells via triple helix formation, Nucleic Acids Research, 44(22):10631-10643. [paper]. (*shared corresponding author).

Allhoff, M., Sere K., Freitas, J., Zenke, M.,  Costa, I.G. (2016), Differential Peak Calling of ChIP-seq Signals with Replicates with THOR, Nucleic Acids Research, 44(20):e153.  [paper][supp][data].

Nascimento, A. C., Prudencio, R.B.C., Costa I.G., A Multiple Kernel Learning algorithm for drug-target interaction prediction, BMC Bioinformatics, 17:46 [paper][supp].

Kolovos, P., Georgomanolis, T., Koeferle A., Larkin J. D., Brant L., Nikolic M., Gusmao E.G., Zirkel A., Knoch T. A., Costa I.G., van Ijcken W. F., Cook P. R., Grosveld F. G., Papantonis A., Binding of nuclear factor kappa B to non-canonical consensus sites reveals its multimodal role during the early inflammatory response, Genome Research, 26:1478-1489 [paper].

de Almeida, D., Ferreira, M. R. P., Franzen, J., Weidner, C., Frobel, J., Zenke, M., Costa, I.G., Wagner, W. (2016). Epigenetic Classification of Human Mesenchymal Stromal Cells. Stem Cell Reports, 6(2):168-175 [paper].

Schemionek, M., Herrmann, O., Merle Reher, M., Chatain, N., Schubert, C., Costa, I. G., Gusmao, E.G., … Koschmieder, S. (2015). MTSS1 is a critical epigenetically regulated tumor suppressor in CML. Leukemia, 30(4):823-32. [paper].

Perkhofer L.,  Walter K., Costa I.G., Bergmann W., Eiseler T., Genze F., Hafner S., Zenke M.,  Seufferlein T.,  Hermann P. C.,  Mueller M., Kleger A., TBX3 drives stemness via increased Activin/Nodal-signalling in human pancreatic ductal adenocarcinoma, Stem Cell Research, 17(2):367-378 [paper].

Lin Q., Weidner C. I., Costa I. G., Marioni, R., Ferreira M. R. P., Deary I. J., Wagner W., DNA methylation levels at individual age-associated CpG sites can be indicative for life expectancy?, Aging, 8(2):394-401 [paper].

Eipel M., Mayer F., Arent T., Ferreira M. R. P., Birkhofer C., Gerstenmaier W., Costa I.G., Ritz-Timme S., Wagner W. (2016), Epigenetic age predictions based on buccal swabs are more precise in combination with cell type-specific DNA methylation signatures, Aging, 8(5):1034-48 [paper].

Souza, M.R.B., Araujo G., Costa, I.G., Oliveira, J.R.O., Combined genome-wide CSF Aβ-42’s associations and simple network properties highlight new risk factors for Alzheimer’s Disease, Journal of Molecular Neuroscience, 58(1):120-128.

Berger A.W., Schwerdel, D., Costa I. G., Hackert T., Giese N.A., Lam S., Barth T. F., Schroppel B., Meining A., Buchler M. W., Zenke M., Hermann P. C., Seufferlein T., Kleger A., Circulating cell-free DNA is a reliable tool to detect hot spot mutations in intraductal papillary mucinous neoplasms of the pancreas, Gastroenterology, 151(2):267-270. [paper].


Lin Q*, Chauvistre H*, Costa IG*, Mitzka S, Gusmao EG, Haenzelmann S, Baying B, Hennuy B, Smeets H, Hoffmann K, Benes V, Sere K, Zenke M, Epigenetic and Transcriptional Architecture of Dendritic Cell Development, Nucleic Acids Research, 43:9680-9693, [paper][data][genome tracks] (*shared first co-author).

Haenzelmann S, Beier F, Gusmao EG, Koch CM, Hummel S, Charapitsa I, Joussen S, Benes V, Brümmendorf TH, Reid G, Costa IG*, Wagner W* (2015) Replicative Senescence is Associated with Nuclear Reorganization and with DNA Methylation at Specific Transcription Factor Binding Sites, Clinical Epigenetics, 7:9 [paper] (*shared corresponding author).

de Souto M. C. P., Jaskowiak P. A., Costa IG (2015) Impact of Missing Data Imputation Methods on Gene Expression Clustering and Classification, BMC Bioinformatics, 16:64 [paper][supp].

Hanzelmann S, Wang J, Guney E, Tang Y,  Zhang E, Axelsson AS, Nenonen H,  Salehi AS,  Wollheim CB, Zetterberg E, Berntorp E., Costa IG, Castelo R, Rosengren AH, Thrombin stimulates insulin secretion via protease-activated receptor-3, Islets, 7(4):e1118195.

Müller M, Schröer J, Azoitei N, Eiseler T, Bergmann W, Köhntop R, Lin Q, Costa IG, Zenke M, Genze F, Weidgang C, Seufferlein T, Liebau S, Kleger A., A time frame permissive for Protein Kinase D2 activity to direct angiogenesis in mouse embryonic stem cells, Scientific Reports, 5:11742 [paper].


Allhoff, M., Sere, K., Chauvistre, H., Lin, Q., Zenke M, Costa IG, Detecting differential peaks in ChIP-seq signals with ODIN, Bioinformatics, 30, 3467-3475 [paper][data][supp].

Gusmao EG, Dieterich C, Zenke M, Costa IG, Detection of Active Transcription Factor Binding Sites with the Combination of DNase Hypersensitivity and Histone Modifications, Bioinformatics, 30(22):3143-51. [paper] [Supp].

Jaskowiak, PA, Campello, RJGB, Costa, IG, On The Selection of Appropriate Distances for Gene Expression Data Clustering, BMC Bioinformatics, 15(Suppl 2):S2 [paper].

Ullius A., Luscher-Firzlaf, Costa IG, Walsemann, G., Forst, A., Gusmao EG, Kapelle, K., Kleine H., Kremmer K., Vervoorts J., Luscher, B., The interaction of MYC with the trithorax protein ASH2L promotes gene transcription by regulating H3K27 modification, Nucleic Acids Research, 42(11):6901-6920 [paper].

Gusmao EG, de Souto, MCP, Issues on Sampling Negative Examples for Predicting Prokaryotic Promoters, Proceedings of the IJCNN, 494-501, [paper].


Jaskowiak, PA, Campello, RJGB, Costa, IG, Proximity Measures for Clustering Gene Expression Microarray Data: a Validation Methodology and a Comparative Analysis, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10(4):845-857. [paper].

Weidner CI., Walenda T., Lin Q., Wölfler MM., Denecke B., Costa IG., Zenke M., Wagner W., Hematopoietic Stem and Progenitor Cells Acquire Distinct DNA-Hypermethylation During in vitro Culture, Scientific Reports, 3:3372, 2013.

Allhoff M., Schoenhuth, A., Martin, M., Costa, IG., Rahmann, S., Marschall, T., Discovering Context-Specific Sequencing Errors, BMC Bioinformatics (RECOMB-Seq 2013), 14(Supp 5):S1. [paper].

Hänzelmann, S., Castelo, R., Guinney, J., GSVA: gene set variation analysis for microarray and RNA-Seq data, BMC Bioinformatics, 14:7, 2013. [paper].

Araujo G., Costa, I.G., Souza, M.R.B., Oliveira, J.R.O., Random Forests and Gene Networks for Association of SNPs to Alzheimer’s Disease, Lecture Notes in Bioinformatics, Volume 8213, 104-115.


Marschall, T., Costa, I. G., Canzar, S., Bauer, M., Klau, G., Schliep, A., Schoenhuth, A. CLEVER: Clique-Enumerating Variant Finder. Bioinformatics, 28(22):2875-2882, 2012. [paper] [F1000].

do Rego, T. G., Roider, H,  de Carvalho, F. A. T., Costa, I.G., Inferring Epigenetic and Transcriptional Regulation during Blood Cell Development with a Mixture of Sparse Linear Models, Bioinformatics, v. 28(18), p.2297-2303, 2012. [paper].

Mahdi T, Hänzelmann S, Salehi A, et al., Secreted frizzled-related protein 4 reduces insulin secretion and is overexpressed in type 2 diabetes, Cell Metabolism, 16(5):625-33, 2012. [paper].

Lorena, A. C., Costa, I. G., Spolaôr, N., de Souto, M. C. P. Analysis of Complexity Indices for Classification Problems: Cancer Gene Expression Data. Neurocomputing, v. 75(1), p. 33-42, 2012. [paper]

Gusmao, E. G., Dieterich, C., Costa, I. G., Prediction of Transcription Factor Binding Sites by Integrating DNase digestion and Histone Modification, Lecture Notes in Bioinformatics, v.7409, p.109-119, 2012.

Jaskowiak, P. A., Campello, R. J. G. B., Costa, I. G. Evaluating Correlation Coefficients for Clustering Gene Expression Profiles of Cancer. Lecture Notes in Bioinformatics, v.7409, p.120-131, 2012.

de Souto, M. C. P., Coelho, A. L. V., Faceli, K., Sakata, T. C., Bonadia, V., Costa, I. G. A comparison of external clustering evaluation indices in the context of imbalanced data sets. Proc. of the Brazilian Symposium on Neural Networks. IEEE, 49-54, 2012.

Macario Filho, V., de Carvalho, F. A. T, Costa, I. G. Predicting gene functions using semi-supervised clustering algorithms with objective function optimization, Proc. of the Brazilian Symposium on Neural Networks, IEEE, 61-66, 2012.


Redestig, H., Costa, I. G. Detection and interpretation of metabolite-transcript co-responses using combined profiling data. Bioinformatics (Proceedings of ISMB 2011), v. 27(13), p. i357-i365, 2011. [paper] [Supp]

Hafemeister, C., Costa, I. G., Schonhuth, A., Schliep, A. Classifying short gene expression time-courses with Bayesian estimation of piecewise constant functions. Bioinformatics, v. 27(7), p. 946-952, 2011. [paper] [Supp]

Costa, I.G., Roider, H., do Rego, T. G., de Carvalho, F. A. T. Predicting Gene Expression in T cell Differentiation from Histone Modifications and Transcription Factor Binding Affinities by Linear Mixture Models, BMC Bioinformatics, v.12, p.S29, 2011. [paper] [Supp]

Schilling, R., Costa, I. G., Schliep, A. pGQL: A Probabilistic Graphical Query Language for Gene Expression Time Courses. Biodata Mining, v. 4, p. 9, 2011. [paper] [software]


B. Georgi, I. G. Costa, A. Schliep, PyMix – The Python mixture package – a tool for clustering of heterogeneous biological data, BMC Bioinformatics, v.11, p.9, 2010 [paper] [Supp]

de Souto, M. C. P., Lorena, A. C., Spolaôr, N., Costa, I. G. Complexity measures of supervised classifications tasks: A case study for cancer gene expression data, In: Proceedings of the International Joint Conference on Neural Networks (IJCNN). Los Alamitos: IEEE, 2010. p.1-7.

Lorena, A. C., Spolaôr, N., Costa, I. G., de Souto, M. C. P. On the Complexity of Gene Marker Selection. Proc. of the Brazilian Symposium on Neural Networks. Los Alamitos, EUA: IEEE, 2010.

Ribeiro, C., Costa, I. G., de Carvalho, F. A. T. Semi-supervised Approach for Finding Cancer Sub-Classes on Gene Expression Data, Advances in Computational Biology, Lecture Notes in Bioinformatics. Berlin: Springer Verlag, 2010. v.6268. p.24-34


Costa I.G., Schoenhuth, A., Hafemeister, C., and Schliep, A., Constrained Mixture Estimation for Analysis and Robust Classification of Clinical Time Series. Bioinformatics, v. 25, p. i6-i14, 2009. [paper] [Supp]

I. G. Costa, A. C. Lorena, L. R. M. P. y Peres, M. C. P. de Souto, Using Supervised Complexity Measures in the Analysis of Cancer Gene Expression Data sets. In: Lecture Notes in Bioinformatics. Berlin: Springer Verlag, 2009 (Best Paper Award).

A.C.A. Nascimento, R.B.C. Prudencio, M.C.P. de Souto, I. G. Costa, Mining rules for selection of clustering methods for cancer gene expression. In: International Conference on Artificial Neural Networks, 2009, Limassol, Cyprus. Proc. of the International Conference on Artificial Neural Networks. Berlin : Springer-Verlag, 2009.


M. C. P. de Souto, I. G. Costa, D. A. S. Araujo, T. B. Ludermir, and A. Schliep, Clustering cancer gene expression data: a comparative study. BMC Bioinformatics, 9:497, 2008 [paper][Supp].

I. G. Costa, S. Roepcke, C. Hafemeister, and A. Schliep, Inferring Differentiation Pathways from Gene Expression. Bioinformatics, v 24, p. i156-i164, 2008, [paper] [Supp].

de Souto, M. C. P., Costa, I. G., Lorena, A. C. On the complexity of gene expression classification data sets. Eighth International Conference on Hybrid Intelligent Systems. Los Alamitos: IEEE, 2008.

M. C. P. de Souto, D. A. S. Araujo, I. G. Costa, R. G. F. Soares, T. B. Ludermir, and A. Schliep, Comparative Study on Normalization Procedures for Cluster Analysis of Gene Expression Datasets. Proceedings of the International Joint Conference on Neural Networks, IEEE Computer Society, pages 3728-3734, 2008 [pre-print].

M. C. P. de Souto, R. B. C. Prudencio, R. G. F. Soares, D. A. S. Araujo, I. G. Costa, T. B. Ludermir, and A. Schliep, Ranking and Selecting Clustering Algorithms Using a Meta-learning Approach. Proceedings of the International Joint Conference on Neural Networks, IEEE Computer Society, 2008 [pre-print].


Costa, I. G., Roepcke, S., Schliep, A. Gene Expression Trees in Lymphoid Development. BMC Immunology, v. 8, p. 25, 2007, [paper][Supp].

I. G. Costa, R. Krause, L. Opitz, and A. Schliep, Semi-supervised learning for the identification of syn-expressed genes from fused microarray and in situ image data. BMC Bioinformatics 2007, Vol. 8, Pages S3. [paper] [Supp].

I. G. Costa, M. C. P. de Souto, and A. Schliep Validating Gene Clusterings by Selecting Informative Gene Ontology Terms with Mutual Information. Advances in Bioinformatics and Computational Biology, Lecture Notes in Bioinformatics, Pages 81-92, Springer Verlag, 2007 [pre-print].


I. G. Costa, A. Schliep, On the Feasibility of Heterogeneous Analysis of Large Scale Biological Data, Proceedings of ECML/PKDD 2006 Workshop on Data and Text Mining for Integrative Biology, 55-60. [preprint]

A. A. Schönhuth, I. G. Costa, A. Schliep, Semi-supervised Clustering of Yeast Gene Expression Data, Japanese-German Workshop on data analysis and classification 2006, Springer, [pre-print].


A. Schliep, I. G. Costa, C. Steinhoff, A. A. Schönhuth. Analyzing Gene Expression Time-Courses, IEEE/ACM Transactions on Computational Biology and Bioinformatics (Special Issue on Machine Learning for Bioinformatics), 2005, 2(3):179-193. [paper]

I. G. Costa, A. Schönhuth, A. Schliep. The Graphical Query Language: a tool for analysis of gene expression time-courses, Bioinformatics, 2005, 21(10):2544-2545. [paper][software]

I. G. Costa, A. Schliep On external indices for mixtures: validating mixtures of genes In From Data and Information Analysis to Knowledge Engineering, Springer 2005, 662-669.


I. G. Costa, F. A. T. de Carvalho, M. C. P. de Souto, Comparative Analysis of Clustering Methods for Gene Expression Time Course Data, Genetics and Molecular Biology, 2004, 27(4):623-631. [paper]


I. G. Costa, F. A. T. De Carvalho, M. C. P. de Souto Comparative Study on Proximity Indices for Cluster Analysis of Gene Expression Time Series. Journal of Intelligent and Fuzzy Systems, 13(2-4), 133-142, 2003.


Costa, I. G., M. C. P. de Souto, F. A. T. De Carvalho. A Symbolic Approach to Gene Expression Time Series Analysis, VII Brazilian Symposium on Neural Networks. Los Alamitos: IEEE Computer Society Press, 2002, 1, 24-30.

I. G. Costa, F. A. T. de Carvalho, M.C.P. de Souto Stability Evaluation of Clustering Algorithms for Time Series Gene Expression Data In: I Brazilian Workshop on Bioinformatics, 2002, 88-90.