Symbolic and statistical approaches to learning are quite complementary in terms of advantages and drawbacks. The first one typically operates in a logical setting and in its simplest formulation aims at inducing an hypothesis which covers all positive examples and no negative ones given the available knowledge. It allows to incorporate available background knowledge in a natural and sistematic way, and outputs a well interpretable solution. However, learning algorithms typically imply a complex search into the discrete space of hypotheses, are mainly focused on binary classification, extensions requiring ad-hoc solutions, and lack a sound generalization theory. Statistical approaches address many of these disadvantages. Kernel machines, in particular, provide a uniform view of learning tasks as different as classification, regression, clustering and novelty detection thanks to the kernel which abstracts away the input representation. They are efficient to compute once an efficient procedure to compute the kernel is available, as the learning algorithm is cast into a convex optimization problem. Finally, they rely on a sound theoretical foundation based on regularization theory. On the other hand, it is not straightforward to encode complex domain knowledge in such a setting, and the solution produced is not easily interpretable. Our work consisted of developing a number of algorithms aimed at combining the advantages of the two approaches. This seminar will be focused on two such solutions, proof tree kernels and kFOIL. In proof tree kernels, the user defines a program (e.g. a set of logic predicates) to be run on examples, which should explore their structure during the run. A kernel is defined over the trace of such run, thus allowing to define the similarity between examples in terms of the similarity between the traces of the program run over them. In kFOIL, a simple inductive logic programming algorithm (FOIL) incrementally generates clauses which are used as input representation for a kernel machine. The performance of the kernel machine in the feature space defined by the clause set is used as a score to guide the search in the clause space.