Major reasons for the emergence of biological network analysis [1–4] are the extensive use of computer systems during the last decade and the availability of highly demanding and complex biological data sets. Important types of biological networks include protein-protein interaction networks [5–7], transcriptional regulatory networks [8, 9], and metabolic networks [7, 10, 11]. Vertices in such networks can represent, e.g., proteins, transcription factors, or metabolites, which are connected by edges representing interactions, concentrations, or reactions, respectively [3, 12]. Thus, vertex- and edge-labeled graphs are an important graph class [13, 14] and are useful for modeling biological networks [3]. Well-known concepts and methods that have often been applied in biological network analysis include graph classes such as scale-free and small-world networks [15, 16], network centralities [12, 17], module and motif detection [18–20], and complexity measures for exploring biological networks structurally [21, 22].

Although a large number of graph-theoretical methods have been developed so far, approaches to process and meaningfully analyze labeled graphs are clearly underrepresented in the scientific literature. In particular, this holds for chemical graph analysis, where various graph-theoretical methods and topological indices have been used intensively, see, e.g., [23–34]. Yet, we state a few examples where such graphs appear in the context of biological network analysis. Structure descriptors that determine the complexity of pathways represented as labeled graphs have been used to examine the relationship between metabolic and phylogenetic information, see [22]. Another challenging task is to determine the similarity between graphs or subgraphs [35–38]. For instance, YANG et al. [38] recently developed path- and graph-matching methods involving vertex- and edge-labeled graphs, which turned out to be useful for biological network comparison. Finally, HUBER et al. [39] reviewed several existing software packages and outlined concrete applications that utilize graph-theoretical concepts for investigating graphs and labeled graphs within molecular biology.

In this paper, we restrict our analysis to a set of bio-chemical graphs which have already been used for predicting Ames mutagenicity, see [40]. To perform this study, we develop and investigate entropic descriptors for vertex- and edge-labeled graphs. Before sketching the main contributions of our paper, we state some facts about topological descriptors which have been used in mathematical chemistry, drug design, and QSPR/QSAR.

As already mentioned, topological indices have proven to be powerful tools in drug design, chemometrics, bioinformatics, and mathematical and medicinal chemistry [23, 24, 26, 28, 29, 34, 41–43]. One reason for their success is the strong need to apply empirical models to solve QSPR (quantitative structure-property relationship) and QSAR (quantitative structure-activity relationship) problems [24, 28, 29, 44] and related tasks in the areas just mentioned. In this paper, we put the emphasis on developing novel molecular descriptors for tackling a problem in QSAR: we will use structural property descriptors of molecules based on SHANNON's entropy for predicting Ames mutagenicity, see [40, 45–47]. Generally, we note that detecting mutagenicity in vitro is based on the bacterial reverse mutation assay (Ames test) and often serves as a crucial tool in drug design and discovery [40, 45–47].

Further, topological descriptors have often been combined with other techniques from statistical data analysis, e.g., clustering methods [26, 48], to infer correlations between the indices used. Besides characterizing chemical graphs [27, 32, 49], topological descriptors have also been applied to quantify the structural similarity of chemicals represented as networks [50, 51]. Among the large number of existing topological indices, an important class relies on SHANNON's entropy to characterize graphs by determining their structural information content [27, 52–54]. Until now, these measures in particular have been applied intensively within biology, ecology, and mathematical chemistry [27, 52, 54–60], especially to measure the complexity of biological and chemical systems [27, 52, 61]. Recently, we developed a novel procedure to infer such information-theoretic measures for graphs that results in so-called partition-independent measures [57, 62]. More precisely, we mean that we do not induce partitions using the procedure manifested by Equations (2) and (3) in [57]. In classical approaches, by contrast, partitions using graph invariants and equivalence criteria have been explicitly induced, see, e.g., [27, 52, 53]. Note that we comment on this problem in the first paragraph of the section 'Partition-Independent Information Measures for Graphs'. In contrast to partition-independent measures, classical partition-based information measures often rely on grouping the elements given by an arbitrary graph invariant according to an equivalence criterion [27, 53, 54, 63].
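To make the classical partition-based idea concrete, the following minimal Python sketch groups the vertices of a graph by the value of a graph invariant and computes SHANNON's entropy of the resulting partition. It is an illustration only and not the descriptors developed in this paper: the function name `partition_entropy`, the choice of vertex degree as the invariant, the pairing of degree with an atom label, and the toy graph are all assumptions made for this example.

```python
import math
from collections import Counter


def partition_entropy(invariant_values):
    """Shannon entropy (in bits) of the vertex partition in which two
    vertices are equivalent iff they share the same invariant value.

    `invariant_values` maps each vertex to a hashable invariant value.
    """
    n = len(invariant_values)
    class_sizes = Counter(invariant_values.values())
    # H = sum_i p_i * log2(1 / p_i), with p_i = |class_i| / n
    return sum((size / n) * math.log2(n / size) for size in class_sizes.values())


# Hypothetical toy graph: a path on four vertices (adjacency lists),
# with hypothetical atom labels attached to the vertices.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
labels = {0: "C", 1: "C", 2: "C", 3: "O"}

# Invariant 1: vertex degree only (purely structural information).
degree = {v: len(neighbors) for v, neighbors in adjacency.items()}

# Invariant 2 (assumption for illustration): degree combined with the label,
# so that equally connected atoms of different type fall into different classes.
degree_and_label = {v: (degree[v], labels[v]) for v in adjacency}

h_structural = partition_entropy(degree)          # classes of size 2 and 2
h_labeled = partition_entropy(degree_and_label)   # classes of size 1, 2, and 1
```

In this toy case the degree partition has two classes of equal size, whereas adding the labels splits one class further, so the labeled partition carries a higher entropy value. This only illustrates why ignoring labels can discard information; it makes no claim about how the measures in this paper are defined.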

**The contribution of our paper is twofold:** First, we develop novel information-theoretic descriptors that can incorporate vertex and edge labels when measuring the information content of a chemical structure. Because, as mentioned above, there is a lack of graph measures that can process vertex- and edge-labeled graphs meaningfully, such descriptors need to be further developed. In terms of analyzing chemical structures, this means that they can only be adequately represented by graphs if different types of atoms (vertices) and different types of bonds (edges) are considered. Hence, there is a strong need to explore such labeled networks. Besides developing the novel information-theoretic measures for vertex- and edge-labeled graphs, we will investigate some of their properties (see section 'Properties of the Novel Information-Theoretic Descriptors') [40, 47]. Second, the paper evaluates the ability of the mentioned descriptors to predict Ames mutagenicity when applying well-known machine learning methods such as random forests (RF) [64, 65] and support vector machines (SVM) [64, 66]. Starting from chemical structures represented as vectors composed of topological descriptors, we will analyze the prediction performance by focusing on the underlying supervised graph classification problem. We want to emphasize that we also combine our novel descriptors with other well-known information-theoretic and non-information-theoretic measures which turned out to be useful in QSPR/QSAR, see, e.g., [29]. Further, we examine the influence on the prediction performance when taking both semantical information (labels) and structural information of the graphs into account.

Finally, we want to point out that considerable related work deals with the multifaceted problems that arise when applying molecular descriptors to machine learning algorithms [67–69]. For example, DESHPANDE et al. [67] developed an approach to find discriminating substructures of chemical graphs. Then, by using a vector representation model for these graphs, they applied several machine learning methods to chemical databases for classifying these structures meaningfully. Another interesting study was done by XUE et al. [68], who applied a variety of molecular descriptors to characterize structural and physicochemical properties of molecules. In particular, they used a feature selection method for automatically selecting molecular descriptors for SVM-prediction of P-glycoprotein substrates and others. As an important result, XUE et al. [68] determined the reduction of noise and its influence on the prediction accuracy of a statistical learning system [70]. The last contribution we want to sketch in brief is due to MAHÉ et al. [69]. In this work, a graph kernel approach [64, 69] was validated for structure-activity-relationship analysis where special kernels based on random walks were used and optimized. Note that more related work can be found in [40, 71–74].