    • Disease-Gene Association Using Genetic Programming

      Entezari Heravi, Ashkan; Department of Computer Science
      Mutations in genes, simple changes in our DNA, can produce undesirable phenotypes known as genetic diseases or disorders. These small changes, which happen frequently, can have extreme consequences. Understanding and identifying these changes, and associating the mutated genes with genetic diseases, can play an important role in our health by enabling better diagnostic and therapeutic strategies for these diseases. Years of experiments have produced a vast amount of data on the human genome and various genetic diseases, and this data still needs to be processed properly to extract useful information. This work is an effort to analyze some useful datasets and to apply different techniques to associate genes with genetic diseases. Two genetic diseases were studied here: Parkinson’s disease and breast cancer. Using genetic programming, we analyzed the complex network around the known disease genes of these diseases and generated a ranking of genes based on their relevance to each disease. To generate these rankings, centrality measures of all nodes in the complex network surrounding the known disease genes of the given genetic disease were calculated. Using genetic programming, all nodes were assigned scores based on the similarity of their centrality measures to those of the known disease genes. The obtained results showed that this method is successful at finding such patterns in centrality measures, and the highly ranked genes are good candidate disease genes worthy of further study. Using standard benchmark tests, we compared our approach against ENDEAVOUR and CIPHER, two well-known disease-gene ranking frameworks, and obtained comparable results.
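      As an illustrative aside, here is a minimal sketch of the centrality-profile idea described above. It assumes the networkx library; the evolved GP scoring function is stood in for by a simple Euclidean distance to the mean centrality profile of the known disease genes, and the example graph and gene identifiers are placeholders, not the thesis's data.

      ```python
      import math
      import networkx as nx

      def centrality_profile(g):
          # one centrality vector per node: degree, closeness, betweenness
          deg = nx.degree_centrality(g)
          clo = nx.closeness_centrality(g)
          bet = nx.betweenness_centrality(g)
          return {v: (deg[v], clo[v], bet[v]) for v in g}

      def rank_candidates(g, known_genes):
          prof = centrality_profile(g)
          known = [prof[v] for v in known_genes]
          mean = tuple(sum(xs) / len(xs) for xs in zip(*known))
          candidates = [v for v in g if v not in known_genes]
          # smaller distance = centrality profile more similar to the disease genes
          return sorted(candidates, key=lambda v: math.dist(prof[v], mean))

      g = nx.karate_club_graph()            # stand-in for a protein-interaction network
      print(rank_candidates(g, known_genes={0, 33})[:5])
      ```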
    • Effect of the Side Effect Machines in Edit Metric Decoding

      Banik, Sharnendu; Department of Computer Science
      The development of general edit metric decoders is a challenging problem, especially with the inclusion of additional biological restrictions that can occur in DNA error-correcting codes. Side effect machines (SEMs), an extension of finite state machines, can provide efficient decoding algorithms for such edit metric codes. However, finding a good machine poses its own set of challenges and is itself an open problem with no general solution. Previous studies utilizing evolutionary computation techniques, such as genetic algorithms and evolutionary programming, to search for good SEMs have found success in terms of decoding accuracy. However, they all worked with extremely constricted problem spaces, i.e., a single code or codes of the same length. Therefore, a general approach that works well across codes of different lengths is yet to be formalized. In this research, several codes of varying lengths are used to study the effectiveness of evolutionary programming (EP) as a general approach for finding efficient edit metric decoders. Two classification methods, direct and fuzzy, are compared while also changing some of the EP settings to observe how the decoding accuracy is affected. The final SEMs are verified against an additional dataset to test their general effectiveness. Regardless of the code length, the best results are found using the fuzzy classification method. For codes of length 10, a maximum accuracy of 99.4% is achieved for distance 1, whereas distances 2 and 3 reach up to 97.1% and 85.9%, respectively. Unsurprisingly, the accuracy suffers for longer codes: the maximum accuracies achieved by codes of length 14 were 92.4%, 85.7%, and 69.2% for distances 1, 2, and 3, respectively. Additionally, the machines are examined for potential bloat by comparing the number of visited states against the total number of states. Some machines were found to have at least one unvisited state, and bloat is seen more in larger machines than in smaller ones. Furthermore, the results are analyzed to find potential trends and relationships among the parameters. The most consistently observed trend is that, when allowed, the longer codes generally show a propensity for larger machines.
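      The core object here is easy to sketch. Below is a minimal side effect machine: a finite state machine over the DNA alphabet whose "side effect" is a vector of state-visit counts, which can then serve as classification features for decoding. The 4-state random transition table is a stand-in for an evolved machine; all names and sizes are illustrative.

      ```python
      import random

      ALPHABET = "ACGT"

      def run_sem(transitions, n_states, word, start=0):
          """Process `word` and return the state-visit counts (the side effect)."""
          visits = [0] * n_states
          state = start
          for ch in word:
              state = transitions[(state, ch)]
              visits[state] += 1
          return visits

      # a random 4-state machine standing in for one found by evolutionary programming
      random.seed(1)
      n = 4
      transitions = {(s, c): random.randrange(n) for s in range(n) for c in ALPHABET}
      print(run_sem(transitions, n, "ACGTACGTAA"))
      ```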
    • Elliptic Curve Cryptography using Computational Intelligence

      Ribaric, Tim; Department of Computer Science
      Public-key cryptography is a fundamental component of modern electronic communication that can be constructed with many different mathematical processes. Presently, cryptosystems based on elliptic curves are becoming popular due to their strong cryptographic strength relative to key size. At the heart of these schemes is the complexity of the elliptic curve discrete logarithm problem (ECDLP). Pollard’s Rho algorithm is a well-known method for solving the ECDLP and thereby breaking ciphers based on elliptic curves for reasonably small key sizes (up to approximately 100 bits in length). It has the same time complexity as other known methods but is advantageous due to its smaller memory requirements. This study considers how to speed up the Rho process by modifying a key component: the iterating function, the part of the algorithm responsible for determining which point is considered next when searching for the solution to the ECDLP. It is replaced with an alternative found through an evolutionary process. This alternative consistently and significantly decreases the number of iterations required by Pollard’s Rho algorithm to find the sought-after solution.
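      For concreteness, here is a runnable sketch of Pollard's Rho for the ECDLP on a toy curve (y^2 = x^3 + 2x + 3 over F_97; all parameters chosen purely for illustration). The classic 3-way partition plays the role of the iterating function that the thesis replaces with an evolved alternative.

      ```python
      import math

      P_MOD, A = 97, 2                      # toy curve y^2 = x^3 + 2x + 3 over F_97

      def inv(x, m):
          return pow(x, -1, m)

      def ec_add(p, q):
          if p is None: return q            # None is the point at infinity
          if q is None: return p
          (x1, y1), (x2, y2) = p, q
          if x1 == x2 and (y1 + y2) % P_MOD == 0:
              return None
          if p == q:
              lam = (3 * x1 * x1 + A) * inv(2 * y1, P_MOD) % P_MOD
          else:
              lam = (y2 - y1) * inv(x2 - x1, P_MOD) % P_MOD
          x3 = (lam * lam - x1 - x2) % P_MOD
          return (x3, (lam * (x1 - x3) - y1) % P_MOD)

      def ec_mul(k, p):
          acc = None
          while k:
              if k & 1:
                  acc = ec_add(acc, p)
              p, k = ec_add(p, p), k >> 1
          return acc

      def point_order(p):
          n, q = 1, p
          while q is not None:
              q, n = ec_add(q, p), n + 1
          return n

      def rho(P, Q, n):
          # each walk element X = a*P + b*Q; the step rule is the iterating function
          def step(X, a, b):
              s = 0 if X is None else X[0] % 3      # classic 3-way partition
              if s == 0: return ec_add(X, P), (a + 1) % n, b
              if s == 1: return ec_add(X, X), 2 * a % n, 2 * b % n
              return ec_add(X, Q), a, (b + 1) % n
          t = step(P, 1, 0)                 # tortoise: one step at a time
          h = step(*t)                      # hare: two steps at a time
          while t[0] != h[0]:
              t = step(*t)
              h = step(*step(*h))
          d = (h[2] - t[2]) % n
          if math.gcd(d, n) != 1:
              return None                   # degenerate collision; re-randomize in practice
          return (t[1] - h[1]) * inv(d, n) % n

      P = (3, 6)                            # on the curve: 27 + 6 + 3 = 36 = 6^2
      n = point_order(P)
      Q = ec_mul(13, P)                     # "secret" discrete log k = 13
      print(rho(P, Q, n))                   # expect 13 (mod n)
      ```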
    • Enabling and Measuring Complexity in Evolving Designs using Generative Representations for Artificial Architecture

      Harrington, Adrian; Department of Computer Science (Brock University, 2012-11-07)
      As the complexity of evolutionary design problems grows, so too must the quality of solutions scale to that complexity. In this research, we develop a genetic programming system with individuals encoded as tree-based generative representations to address scalability. This system is capable of multi-objective evaluation using a ranked-sum scoring strategy. We examine Hornby's features and measures of modularity, reuse, and hierarchy in evolutionary design problems. Experiments are carried out using the system to generate three-dimensional forms, and the characteristics of features such as modularity, reuse, and hierarchy are analyzed. This work expands on Hornby's by examining a new and more difficult problem domain. The results from these experiments show that individuals encoded with all three features performed best overall. The measures of complexity are also seen to conform to Hornby's results. Moving forward with only this best-performing encoding, the system was applied to the generation of three-dimensional external building architecture. One objective considered was passive solar performance, in which the system was challenged with generating forms that optimize exposure to the Sun. The results from these and other experiments satisfied the design requirements, and the system was shown to scale well to the architectural problems studied.
    • Epidemic Simulation and Mitigation via Evolutionary Computation

      Dubé, Michael; Department of Computer Science
      A global pandemic remains a public health event that presents a unique and unpredictable challenge for those making health-related decisions and for the populations who experience the virus. A pandemic also provides an opportunity, however, for researchers and health administrations around the world to mobilize in the fields of epidemiology, computer science, and mathematics to generate epidemic models, vaccines, and vaccination strategies that mitigate unfavourable outcomes. To this end, the Local THADS-N generative representation, which creates personal contact networks representing the social connections within a population, is introduced and expanded upon. This representation is driven by an evolutionary algorithm and is modified to include new local edge operations that improve the performance of the system across several test problems. These problems include an epidemic's duration, its spread through a population, and its closeness to past epidemic behaviour. The system is further developed to represent sub-communities known as districts, better articulating epidemics spreading within and between neighbourhoods. In addition, the representation is used to simulate four competing vaccination strategies in preparation for iterative vaccine deployment amongst a population, an inevitability given the lag inherent in developing vaccines. Finally, the Susceptible-Infected-Removed (SIR) model of infection used by the system is expanded in preparation for adding an asymptomatic state of infection, as seen in the COVID-19 pandemic.
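      A minimal discrete-time SIR simulation on a personal contact network, of the kind such systems evaluate candidate networks against, can be sketched as follows. The one-time-step infectious period, the infection probability, and the ring network are illustrative assumptions, not the THADS-N settings.

      ```python
      import random

      def simulate_sir(adj, patient_zero=0, p_infect=0.3, seed=0):
          """adj: adjacency list of the contact network. Returns the epidemic curve."""
          random.seed(seed)
          state = ["S"] * len(adj)
          state[patient_zero] = "I"
          curve = []
          while "I" in state:
              newly = {u for v, s in enumerate(state) if s == "I"
                       for u in adj[v] if state[u] == "S" and random.random() < p_infect}
              for v, s in enumerate(state):
                  if s == "I":
                      state[v] = "R"        # nodes are infectious for one time step here
              for u in newly:
                  state[u] = "I"
              curve.append(len(newly))      # new infections per time step
          return curve

      # a ring of 10 people, each in contact with two neighbours
      ring = [[(i - 1) % 10, (i + 1) % 10] for i in range(10)]
      print(simulate_sir(ring))
      ```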
    • Equational Reasoning about Object-Oriented Programs

      Hossain, Md. Nour; Department of Computer Science (2013-04-08)
      Formal verification of software can be an enormous task. This fact has led some software engineers to claim that formal verification is not feasible in practice. One possible way of supporting the verification process is a programming language that provides powerful abstraction mechanisms combined with intensive reuse of code. In this thesis we present a strongly typed, functional, object-oriented programming language. This language features type operators of arbitrary kind corresponding to so-called type protocols. Subclassing and inheritance are based on higher-order matching, i.e., they utilize type protocols as the basic tool for reuse of code. We formally define the operational and axiomatic semantics of this language. The latter is the basis of the interactive proof assistant VOOP (Verified Object-Oriented Programs), which allows the user to prove equational properties of programs interactively.
    • Evolution of architectural floor plans

      Flack, Robert W. J.; Department of Computer Science (Brock University, 2011-10-13)
      Layout planning is a process of sizing and placing rooms (e.g. in a house) while attempting to optimize various criteria. Often there are conflicting criteria such as construction cost, minimizing the distance between related activities, and meeting the area requirements for these activities. The process of layout planning has mostly been done by hand, with a handful of attempts to automate the process. This thesis explores some of these past attempts and describes several new techniques for automating the layout planning process using evolutionary computation. These techniques are inspired by the existing methods, while adding some of their own innovations. Additional experiments are done to test the possibility of allowing polygonal exteriors with rectilinear interior walls. Several multi-objective approaches are used to evaluate and compare fitness. The evolutionary representation and requirements specification used provide great flexibility in problem scope and depth and are worthy of consideration in future layout and design attempts. The system outlined in this thesis is capable of evolving a variety of floor plans conforming to functional and geometric specifications. Many of the resulting plans look reasonable even when compared to a professional floor plan. Additionally, polygonal and multi-floor buildings were also generated.
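      A toy version of the kind of layout evaluation described above: rooms as axis-aligned rectangles, scored by deviation from required areas plus the distance between related rooms. The weights, room names, and criteria are illustrative; the thesis uses richer criteria and multi-objective comparison.

      ```python
      def plan_cost(rooms, required_area, related_pairs):
          """rooms: name -> (x, y, w, h); required_area: name -> target area;
          related_pairs: pairs of room names that should be close. Lower is better."""
          area_penalty = sum(abs(w * h - required_area[name])
                             for name, (x, y, w, h) in rooms.items())
          def center(name):
              x, y, w, h = rooms[name]
              return (x + w / 2, y + h / 2)
          # Manhattan distance between centers of related rooms
          dist_penalty = sum(abs(center(a)[0] - center(b)[0]) +
                             abs(center(a)[1] - center(b)[1])
                             for a, b in related_pairs)
          return area_penalty + dist_penalty

      rooms = {"kitchen": (0, 0, 3, 4), "dining": (3, 0, 4, 4), "bed": (0, 4, 5, 3)}
      print(plan_cost(rooms, {"kitchen": 12, "dining": 16, "bed": 14},
                      [("kitchen", "dining")]))
      ```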
    • Evolutionary synthesis of stochastic gene network models using feature-based search spaces

      Imada, Janine.; Department of Computer Science (Brock University, 2009-01-28)
      A feature-based fitness function is applied in a genetic programming system to synthesize stochastic gene regulatory network models whose behaviour is defined by a time course of protein expression levels. Typically, when targeting time series data, the fitness function is based on a sum-of-errors involving the values of the fluctuating signal. While this approach is successful in many instances, its performance can deteriorate in the presence of noise. This thesis explores a fitness measure determined from a set of statistical features characterizing the time series' sequence of values, rather than the actual values themselves. Through a series of experiments involving symbolic regression with added noise and gene regulatory network models based on the stochastic π-calculus, it is shown to successfully target oscillating and non-oscillating signals. This practical and versatile fitness function offers an alternate approach, worthy of consideration for use in algorithms that evaluate noisy or stochastic behaviour.
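      The contrast with a sum-of-errors fitness is easy to illustrate. The sketch below compares time series by a small statistical feature vector (mean, standard deviation, mean-crossing rate) instead of pointwise error; the particular features are assumptions for illustration, not the thesis's feature set.

      ```python
      import math
      import statistics

      def features(series):
          mean = statistics.fmean(series)
          std = statistics.pstdev(series)
          # crude oscillation indicator: how often the signal crosses its mean
          crossings = sum((a - mean) * (b - mean) < 0
                          for a, b in zip(series, series[1:]))
          return (mean, std, crossings / (len(series) - 1))

      def feature_fitness(candidate, target):
          # distance between feature vectors, not between raw signal values
          return sum(abs(c - t) for c, t in zip(features(candidate), features(target)))

      target = [math.sin(t / 3.0) for t in range(100)]
      noisy = [x + 0.05 * math.sin(17.0 * t) for t, x in enumerate(target)]
      print(feature_fitness(noisy, target))   # small despite pointwise differences
      ```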
    • Evolving Passive Solar Buildings Using Multi-Behavioural Diversity Search Strategies

      Salma, Umme; Department of Computer Science
      Energy-efficient building design plays a major role in building a green environment and planning sustainable urban areas. Energy-efficiency measures for building design cover heating, cooling, and ventilation, as well as the cost of construction materials. In passive solar building design, sunlight exposure is used to heat the building in winter, while heat is rejected in summer to keep the building cool. The goals of passive solar building design are to minimize energy costs and the use of devices for heating or cooling. The major goal of this research is to increase the diversity of solutions evolved with an evolutionary system for green building design. An existing fitness-based genetic programming system for building design is enhanced with a search paradigm called novelty search, which uses measured aspects of designs in an attempt to promote more diverse or novel solutions. Instead of optimizing an objective, novelty search measures behaviors to obtain diverse solutions. We combine novelty search and fitness scores using a many-objective strategy called sum of ranks. The simulation software EnergyPlus is used to evaluate building designs and energy costs. We compare vanilla genetic programming solutions with our novelty-driven solutions. Experimental results show that genetic programming solutions are more fit, but the novelty strategies create more diverse solutions; for example, novelty search solutions use a much more diverse selection of building materials.
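      The sum-of-ranks combination mentioned above can be sketched in a few lines: each individual is ranked separately on every objective, and its ranks are summed into a single score. Lower-is-better objectives and arbitrary tie-breaking are assumed here for illustration.

      ```python
      def sum_of_ranks(objectives):
          """objectives: one tuple of objective values per individual (lower = better).
          Returns one rank-sum score per individual (lower = better overall)."""
          n_obj = len(objectives[0])
          scores = [0] * len(objectives)
          for j in range(n_obj):
              order = sorted(range(len(objectives)), key=lambda i: objectives[i][j])
              for rank, i in enumerate(order):
                  scores[i] += rank
          return scores

      # three designs scored on (energy cost, -novelty): the second is best on one
      # objective and worst on the other
      print(sum_of_ranks([(10.0, 0.5), (4.0, 0.9), (7.0, 0.2)]))
      ```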
    • Extending relAPS to first order logic

      Aameri, Bahar; Department of Computer Science (Brock University, 2011-03-08)
      RelAPS is an interactive system that assists in proving relation-algebraic theorems. The aim of the system is to provide an environment where a user can perform a relation-algebraic proof much as they would with pencil and paper. The previous version of RelAPS accepts only Horn formulas. To extend the system to first-order logic, we have defined and implemented a new language based on the theory of allegories, as well as a new calculus. The language has two different kinds of terms: object terms and relational terms, where object terms are built from object constant symbols and object variables, and relational terms from typed relational constant symbols, typed relational variables, typed operation symbols, and the regular operations available in any allegory. The calculus is a mixture of natural deduction and the sequent calculus; it is formulated in a sequent style but with exactly one formula on the right-hand side. We have shown soundness and completeness of this new logic, which verifies that the underlying proof system of RelAPS works correctly.
    • Efficient Merging and Decomposition Variants of Cooperative Particle Swarm Optimization for Large Scale Problems

      Douglas, Jay; Department of Computer Science
      For large-scale optimization problems (LSOPs), an increased problem size reduces performance both by increasing the landscape complexity and by exponentially increasing the size of the search space. These contributing factors make up the "curse of dimensionality", which is addressed either by improving the search operator of the meta-heuristic or by decomposing the high-dimensional problem into smaller sub-problems. Unfortunately, non-separable LSOPs contain a growing number of variable dependencies which should be optimized together but are often separated into different sub-problems due to insufficient grouping strategies. Various particle swarm optimization (PSO) techniques have been proposed to address these LSOPs, either through improved search operators or through decomposition. However, there is a lack of comparison between them showing which PSO variant performs best for specific types of LSOPs. Additionally, decomposition variants which utilize a cooperative PSO (CPSO) approach still struggle to properly group related variables in the more difficult non-separable multimodal problems. In an attempt to better optimize these non-separable LSOPs, this thesis introduces two new adaptive decomposition and merging CPSO algorithms, referred to as DCPSO2 and MCPSO2, respectively, which offer a new regrouping strategy that adaptively splits and merges stagnated sub-swarms according to their fitness. The algorithms proposed in this thesis are then compared against existing CPSO variants in order to establish the best decomposition-based PSO algorithm for LSOPs. Results show that the decomposition and merging variants perform competitively with previously well-established CPSO algorithms for large-scale problems across all problem classes. DCPSO ranks the highest in terms of accuracy across all non-separable problems, while MCPSO and MCPSO2 prove to have the fastest convergence amongst all algorithms.
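      The cooperative decomposition scheme these variants build on can be sketched as follows: dimensions are split into groups, and each sub-swarm optimizes its slice against a shared context vector. The adaptive split/merge logic of DCPSO2 and MCPSO2 is elided, and a crude random-sampling update stands in for a full PSO velocity update; all parameters are illustrative.

      ```python
      import random

      def cpso(f, dim, groups, iters=200, samples=10):
          random.seed(0)
          context = [random.uniform(-5, 5) for _ in range(dim)]
          def eval_slice(idxs, values):
              x = context[:]                # evaluate a slice against the context vector
              for i, v in zip(idxs, values):
                  x[i] = v
              return f(x)
          for _ in range(iters):
              for idxs in groups:           # one sub-swarm per dimension group
                  best_v = None
                  best_val = eval_slice(idxs, [context[i] for i in idxs])
                  for _ in range(samples):  # sampling stand-in for the PSO update
                      cand = [context[i] + random.gauss(0, 0.5) for i in idxs]
                      val = eval_slice(idxs, cand)
                      if val < best_val:
                          best_v, best_val = cand, val
                  if best_v is not None:    # commit improvements to the context
                      for i, v in zip(idxs, best_v):
                          context[i] = v
          return context, f(context)

      sphere = lambda x: sum(v * v for v in x)
      print(cpso(sphere, dim=6, groups=[[0, 1, 2], [3, 4, 5]]))
      ```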
    • Feature Selection and Classification Using Age Layered Population Structure Genetic Programming

      Awuley, Anthony; Department of Computer Science
      The curse of dimensionality is a major problem in the fields of machine learning, data mining, and knowledge discovery. Exhaustive search for the optimal subset of relevant features in a high-dimensional dataset is NP-hard. Sub-optimal population-based stochastic algorithms such as GP and GA are good choices for searching through large search spaces and are usually more feasible than exhaustive and deterministic search algorithms. On the other hand, population-based stochastic algorithms often suffer from premature convergence on mediocre sub-optimal solutions. The Age Layered Population Structure (ALPS) is a novel meta-heuristic for overcoming premature convergence in evolutionary algorithms and for improving search in the fitness landscape. The ALPS paradigm uses an age measure to control breeding and competition between individuals in the population. This thesis uses a modification of the ALPS GP strategy called Feature Selection ALPS (FSALPS) for feature subset selection and classification in varied supervised learning tasks. FSALPS uses a novel frequency-count system to rank features in the GP population based on evolved feature frequencies. The ranked features are translated into probabilities, which are used to control evolutionary processes such as terminal-symbol selection for the construction of GP trees and sub-trees. The FSALPS meta-heuristic continuously refines the feature subset selection process while simultaneously evolving efficient classifiers through a non-converging evolutionary process that favors selection of features with high discrimination of class labels. We investigated and compared the performance of canonical GP, ALPS, and FSALPS on high-dimensional benchmark classification datasets, including a hyperspectral image. Using ANOVA with Tukey's HSD test at a 95% confidence level, ALPS and FSALPS dominated canonical GP in evolving smaller but efficient trees with fewer bloated expressions. FSALPS significantly outperformed canonical GP, ALPS, and some feature selection strategies reported in the related dimensionality-reduction literature.
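      The frequency-count idea is straightforward to sketch: tally how often each feature appears in the current population, smooth the tallies into a probability distribution, and bias terminal selection accordingly. The smoothing floor and the flat list encoding of trees are assumptions for illustration.

      ```python
      import random
      from collections import Counter

      def terminal_probabilities(population, features, floor=0.01):
          """Turn evolved feature frequencies into a selection distribution."""
          counts = Counter()
          for tree in population:           # trees flattened to symbol lists here
              counts.update(s for s in tree if s in features)
          total = sum(counts[f] + floor for f in features)
          return {f: (counts[f] + floor) / total for f in features}

      def pick_terminal(probs):
          # biased terminal-symbol selection for building GP trees/sub-trees
          return random.choices(list(probs), weights=list(probs.values()))[0]

      pop = [["x1", "x3", "+", "x1"], ["x1", "x2", "*"]]
      probs = terminal_probabilities(pop, {"x1", "x2", "x3", "x4"})
      print(probs, pick_terminal(probs))
      ```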
    • Formalizing affordances in situation

      Lenarcic, Adam; Department of Computer Science (2012-04-03)
      The representation of a perceptual scene by a computer is usually limited to numbers representing dimensions and colours. The theory of affordances attempted to provide a new way of representing an environment with respect to a particular agent. The view was introduced as part of an entire field of psychology labeled 'ecological', which has since branched into computer science through the fields of robotics and formal methods. This thesis describes the concept of affordances, reviews several existing formalizations, and takes a brief look at applications to robotics. The formalizations put forth in the last 20 years have no agreed-upon structure, beyond the requirement that the agent and the environment be taken in relation to one another. Situation theory has also been evolving since its introduction by Barwise and Perry in 1983. The theory provides a formal way to represent any arbitrary piece of information in terms of relations. This thesis takes a toy version of situation theory published in CSLI Lecture Notes No. 22 and extends its ontologies to include specialized affordance types and individual object types. This allows for the definition of semantic objects called environments, which support a situation and a set of affordances, and niches, which refer to a set of actions for an individual. Finally, a possible way for an environment to change into a new environment is suggested via the activation of an affordance.
    • A Functional Programming Language with Patterns and Copatterns

      Alkhulaif, Shams A.; Department of Computer Science
      Since the emergence of coinductive data types in functional programming languages, various languages such as Haskell and Coq have tried different ways of dealing with them, yet none of them handles coinductive data types properly. In lazy languages such as Haskell, inductive and coinductive data types are gathered and mixed in a single type, such as the list type. Moreover, some languages such as Coq use the same constructors that are used for inductive data types as a tool to tackle coinductive data types, and while other languages such as Haskell do use destructors, they do not use them properly. Coinductive data types behave differently from inductive data types, and it is therefore more appropriate to deal with them differently. In this thesis, we propose a new functional programming language where the two are treated dually: inductive data types are defined by constructors (patterns), and coinductive data types are defined by observations (copatterns). This approach is more appropriate for dealing with coinductive data types, whose importance comes from their role in creating safer and more sophisticated software.
    • GA approach for finding Rough Set decision rules based on bireducts

      Rybkin, Oleg; Department of Computer Science
      Feature selection plays an important role in knowledge discovery and data mining. In traditional rough set theory, feature selection using reducts, minimal discerning sets of attributes, is an important area. Nevertheless, the original definition of a reduct is restrictive, so previous research proposed taking into account not only the horizontal reduction of information by feature selection, but also a vertical reduction considering suitable subsets of the original set of objects. Following the work mentioned above, a new approach to generating bireducts using a multi-objective genetic algorithm is proposed. Although genetic algorithms have been used to calculate reducts in previous works, we did not find any work in which genetic algorithms were adopted to calculate bireducts. Compared to earlier work in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm estimated the quality of each bireduct by the values of two objective functions as evolution progressed, and consequently a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied, and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used to perform a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine whether the differences between the results were significant. The experiments showed that the proposed method was able to reduce the number of bireducts necessary to achieve good prediction accuracy. The influence of different genetic operators and fitness evaluation strategies on the prediction accuracy was also analyzed. It was shown that the prediction accuracies of the proposed method are comparable with the best results in the machine learning literature, and that some of them outperform those results.
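      A sketch of what evaluating a bireduct involves, under the usual definition: a pair (B, X) of attribute and object subsets is consistent if B discerns every pair of objects in X with different decisions, and the two GA objectives can then be, for example, minimizing |B| while maximizing |X|. The tiny decision table below is illustrative.

      ```python
      def is_bireduct_consistent(table, decisions, B, X):
          """table[u][a]: value of attribute a on object u; B: attribute indices;
          X: object indices. True iff B discerns all decision-differing pairs in X."""
          xs = sorted(X)
          for i, u in enumerate(xs):
              for v in xs[i + 1:]:
                  if decisions[u] != decisions[v] and all(
                      table[u][a] == table[v][a] for a in B
                  ):
                      return False   # u, v indiscernible under B yet decisions differ
          return True

      def objectives(B, X):
          return (len(B), -len(X))   # both minimized by the multi-objective GA

      table = [[1, 0, 2], [1, 1, 2], [0, 0, 1], [1, 0, 1]]
      decisions = [0, 1, 1, 0]
      # attribute 1 alone discerns objects {0, 1, 3}: a vertical reduction drops object 2
      print(is_bireduct_consistent(table, decisions, B={1}, X={0, 1, 3}))
      ```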
    • Game Theory-based Allocation Management in VCC Networks

      Tejani, Binal; Department of Computer Science
      Vehicular Ad-hoc Networks (VANETs) have contributed significantly towards improving road traffic management and safety. VANETs integrated with vehicular clouds allow underutilized vehicular resources to be used for efficient resource management and the fulfillment of service requests. However, due to the frequently changing topology of vehicular cloud networks, vehicles often move out of the coverage area of roadside units (RSUs), disconnecting from the RSUs and interrupting the fulfillment of ongoing service requests. In addition, working with heterogeneous vehicles makes it difficult to match service requests with the varying resources of individual vehicles. To address these challenges, this work introduces the concept of clustering resources from nearby vehicles to form Combined Resource Units (CRUs). These units contribute to maximizing the rate of fulfillment of service requests. CRU composition is helpful especially given the heterogeneity of vehicles, since it clusters their varying resources into a single unit. Vehicle resources are clustered into CRUs based on three different-sized pools, making the service-matching process more time-efficient. Previous works have adopted stochastic models for resource clustering configurations; this work instead adopts distinct search algorithms for CRU composition, which are computationally less complex. Results showed that lightweight search algorithms, such as the selective search algorithm (SSA), achieved close to 80% resource availability without over-assembling CRUs in higher-density scenarios. Following CRU composition, a game-theoretic approach is adopted for allocating CRUs to service requests. Under this approach, the CRUs play a non-cooperative game to maximize their utility, contributing to fairness, efficiency, improved system performance, and reduced system overhead. The utility value takes into account the received signal strength (RSS) of each CRU and the resources required to fulfill a request. Results of the game model showed that the proposed approach to CRU composition obtained a 90% success rate in matching and fulfilling service requests.
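      An illustrative utility function of the shape described above, rewarding signal strength and penalizing over-commitment of resources; the dBm normalization range, weights, and resource keys are assumptions for illustration, not the thesis's exact model.

      ```python
      def cru_utility(rss_dbm, cru, request, w_signal=1.0, w_waste=0.5):
          # normalize RSS from a typical [-100, -40] dBm range into [0, 1]
          signal = max(0.0, min(1.0, (rss_dbm + 100) / 60))
          if any(cru.get(k, 0) < need for k, need in request.items()):
              return 0.0                    # the CRU cannot fulfil the request at all
          # fraction of the CRU's resources that would sit idle on this request
          waste = sum(have - request.get(k, 0) for k, have in cru.items())
          return w_signal * signal - w_waste * waste / max(1, sum(cru.values()))

      print(cru_utility(-60, {"cpu": 4, "ram": 4}, {"cpu": 2, "ram": 4}))
      ```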
    • Generating Aesthetically Pleasing Images in a Virtual Environment using Particle Swarm Optimization

      Barry, William; Department of Computer Science (Brock University, 2012-11-07)
      This research focuses on generating aesthetically pleasing images in virtual environments using the particle swarm optimization (PSO) algorithm. PSO is a stochastic, population-based search algorithm inspired by the flocking behavior of birds. In this research, we implement swarms of cameras flying through a virtual world in search of an image that is aesthetically pleasing. Virtual world exploration using particle swarm optimization is a new research area and is of interest to both the scientific and artistic communities. Aesthetic rules such as the rule of thirds, subject matter, colour similarity, and horizon line are analyzed together as a multi-objective problem over rendered images. A new multi-objective PSO algorithm, the sum-of-ranks PSO, is introduced and empirically compared to other single-objective and multi-objective swarm algorithms. An advantage of the sum-of-ranks PSO is that it is useful for solving the high-dimensional problems arising in this research. Through many experiments, we show that our approach is capable of automatically producing images satisfying a variety of supplied aesthetic criteria.
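      One of the aesthetic criteria is simple enough to sketch directly: a rule-of-thirds score that rewards a subject positioned near one of the four third-line intersections of the frame. The normalization by the image diagonal is an assumption for illustration.

      ```python
      def rule_of_thirds_score(subject_xy, width, height):
          # the four intersections of the horizontal and vertical third lines
          thirds = [(width * i / 3, height * j / 3) for i in (1, 2) for j in (1, 2)]
          sx, sy = subject_xy
          d = min(((sx - x) ** 2 + (sy - y) ** 2) ** 0.5 for x, y in thirds)
          diag = (width ** 2 + height ** 2) ** 0.5
          return 1.0 - d / diag   # 1.0 when the subject sits on an intersection

      print(rule_of_thirds_score((213, 160), 640, 480))   # near the (1/3, 1/3) point
      ```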
    • Generating finite integral relation algebras

      Zhang, Si; Department of Computer Science (Brock University, 2011-03-08)
      Relation algebras, and categories of relations in particular, have proven to be extremely useful as a fundamental tool in mathematics and computer science. Since relation algebras are Boolean algebras with some well-behaved operations, every such algebra provides an atom structure, i.e., a relational structure on its set of atoms. In the case of complete and atomic algebras (e.g., finite algebras), the original algebra can be recovered from its atom structure by using the complex algebra construction. This gives a representation of relation algebras as the complex algebras of certain relational structures. This property is of particular interest because storing the atom structure requires less space than storing the entire algebra. In this thesis I introduce and implement three structures representing atom structures of integral heterogeneous relation algebras, i.e., categorical versions of relation algebras. The first structure simply embeds a homogeneous atom structure of a relation algebra into the heterogeneous context. The second structure is obtained by splitting all symmetric idempotent relations; this new algebra is in almost all cases a heterogeneous structure having more objects than the original one. Finally, I define two different union operations to combine two algebras into a single one.
    • Generating Relation Algebras for Qualitative Spatial Reasoning

      Siddavaatam, Prathap; Department of Computer Science (2013-04-15)
      Basic relationships between certain regions of space are formulated in natural language in everyday situations. For example, a customer specifies the outline of a future home to the architect by indicating which rooms should be close to each other. Qualitative spatial reasoning, as an area of artificial intelligence, tries to develop a theory of space based on similar notions. In formal ontology and in ontological computer science, mereotopology is a first-order theory, embodying mereological and topological concepts, of the relations among wholes, parts, parts of parts, and the boundaries between parts. We introduce abstract relation algebras and present their structural properties as well as their connection to algebras of binary relations. This is followed by details of the expressiveness of algebras of relations for region-based models. Mereotopology has been the main basis for most region-based theories of space, and since its earliest inception many mereotopological theories have been proposed in artificial intelligence, among which the Region Connection Calculus is the most prominent. The expressiveness of the Region Connection Calculus in relational logic is far greater than its original eight base relations might suggest. In this thesis we formulate ways to automatically generate representable relation algebras using spatial data based on the Region Connection Calculus. The generation of new algebras is a two-pronged approach involving the splitting of existing relations to form new algebras and the refinement of such newly generated algebras. We present an implementation of a system that automates the aforementioned steps and provides an effective and convenient interface for defining new spatial relations and generating representable relation algebras.
    • Generator Matrix Based Search for Extremal Self-Dual Binary Error-Correcting Codes

      Derka, Martin; Department of Computer Science (Brock University, 2012-09-18)
      Self-dual doubly even linear binary error-correcting codes, often referred to as Type II codes, are closely related to many combinatorial structures such as 5-designs. Extremal codes are codes that have the largest possible minimum distance for a given length and dimension. The existence of an extremal (72,36,16) Type II code is still open. Previous results show that the automorphism group of a putative code C with the aforementioned properties has order 5 or an order dividing 24. In this work, we present a method and the results of an exhaustive search showing that such a code C cannot admit Z6 as an automorphism group. In addition, we present a previously unpublished construction of the extended Golay code due to P. Becker. We generalize the notion and provide an example of another Type II code that can be obtained in this fashion. Consequently, we relate Becker's construction to the construction of binary Type II codes from codes over GF(2^r) via the Gray map.
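      The properties being searched for are easy to check on small instances. The sketch below enumerates the binary code spanned by a generator matrix and reports its minimum distance and whether every nonzero codeword weight is divisible by 4 (the doubly even property of Type II codes), using the [8,4,4] extended Hamming code, the smallest Type II code, as an example.

      ```python
      from itertools import product

      def code_properties(G):
          """Enumerate the span of generator matrix G over GF(2); return
          (minimum distance, all nonzero weights divisible by 4)."""
          k, n = len(G), len(G[0])
          min_wt, doubly_even = n, True
          for coeffs in product((0, 1), repeat=k):
              word = [0] * n
              for c, row in zip(coeffs, G):
                  if c:
                      word = [w ^ r for w, r in zip(word, row)]
              wt = sum(word)
              if wt:
                  min_wt = min(min_wt, wt)
                  doubly_even = doubly_even and wt % 4 == 0
          return min_wt, doubly_even

      G = [  # the [8,4,4] extended Hamming code
          [1, 0, 0, 0, 0, 1, 1, 1],
          [0, 1, 0, 0, 1, 0, 1, 1],
          [0, 0, 1, 0, 1, 1, 0, 1],
          [0, 0, 0, 1, 1, 1, 1, 0],
      ]
      print(code_properties(G))   # expect (4, True)
      ```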