• Heuristics for the Critical Node Detection Problem in Large Complex Networks

      Edalatmanesh, Mahmood; Department of Computer Science (Brock University, 2013-09-12)
      Complex networks have recently attracted a significant amount of research attention due to their ability to model real world phenomena. One important problem often encountered is to limit diffusive processes spread over the network, for example mitigating pandemic disease or computer virus spread. A number of problem formulations have been proposed that aim to solve such problems based on desired network characteristics, such as maintaining the largest network component after node removal. The recently formulated critical node detection problem aims to remove a small subset of vertices from the network such that the residual network has minimum pairwise connectivity. Unfortunately, the problem is NP-hard and also the number of constraints is cubic in number of vertices, making very large scale problems impossible to solve with traditional mathematical programming techniques. Even many approximation algorithm strategies such as dynamic programming, evolutionary algorithms, etc. all are unusable for networks that contain thousands to millions of vertices. A computationally efficient and simple approach is required in such circumstances, but none currently exist. In this thesis, such an algorithm is proposed. The methodology is based on a depth-first search traversal of the network, and a specially designed ranking function that considers information local to each vertex. Due to the variety of network structures, a number of characteristics must be taken into consideration and combined into a single rank that measures the utility of removing each vertex. Since removing a vertex in sequential fashion impacts the network structure, an efficient post-processing algorithm is also proposed to quickly re-rank vertices. Experiments on a range of common complex network models with varying number of vertices are considered, in addition to real world networks. The proposed algorithm, DFSH, is shown to be highly competitive and often outperforms existing strategies such as Google PageRank for minimizing pairwise connectivity.
    • A Hybrid Approach to Network Robustness Optimization using Edge Rewiring and Edge Addition

      Paterson, James; Department of Computer Science
      Networks are ubiquitous in the modern world. From computer and telecommunication networks to road networks and power grids, networks make up many crucial pieces of infrastructure that we interact with on a daily basis. These networks can be subjected to damage from many different sources, both random and targeted. If one of these networks receives too much damage, it may be rendered inoperable, which can have disastrous consequences. For this reason, it is in the best interests of those responsible for these networks to ensure that they are highly robust to failure. Since it is not usually feasible to rebuild most existing networks from scratch to make them more resilient, it is necessary to have an approach that can modify an existing network to make it more robust to failure. Previous work has established several methods of accomplishing this task, including edge rewiring and edge addition. Both of these methods can be very useful for optimizing network robustness, but each comes with its own set of limitations. This thesis proposes a new hybrid approach to network robustness optimization that combines both of these approaches. Four edge rewiring based metaheuristic approaches were modified to incorporate one of three different edge addition strategies. A comparative study was performed on these new hybrid optimizers, comparing them to each other and to the vanilla edge rewiring only approach on both synthetic and real world networks. Experiments showed that this new hybrid approach to network robustness optimization leads to much more highly robust networks than an edge rewiring only approach.
    • Hyperspectral Mineral Identification using SVM and SOM

      Iranzad, Arash; Department of Computer Science (Brock University, 2013-10-28)
      Remote sensing techniques involving hyperspectral imagery have applications in a number of sciences that study some aspects of the surface of the planet. The analysis of hyperspectral images is complex because of the large amount of information involved and the noise within that data. Investigating images with regard to identify minerals, rocks, vegetation and other materials is an application of hyperspectral remote sensing in the earth sciences. This thesis evaluates the performance of two classification and clustering techniques on hyperspectral images for mineral identification. Support Vector Machines (SVM) and Self-Organizing Maps (SOM) are applied as classification and clustering techniques, respectively. Principal Component Analysis (PCA) is used to prepare the data to be analyzed. The purpose of using PCA is to reduce the amount of data that needs to be processed by identifying the most important components within the data. A well-studied dataset from Cuprite, Nevada and a dataset of more complex data from Baffin Island were used to assess the performance of these techniques. The main goal of this research study is to evaluate the advantage of training a classifier based on a small amount of data compared to an unsupervised method. Determining the effect of feature extraction on the accuracy of the clustering and classification method is another goal of this research. This thesis concludes that using PCA increases the learning accuracy, and especially so in classification. SVM classifies Cuprite data with a high precision and the SOM challenges SVM on datasets with high level of noise (like Baffin Island).
    • Image Evolution Using 2D Power Spectra

      Gircys, Michael; Department of Computer Science
      Procedurally generated textures have seen use in many applications, are a high-interest topic when exploring evolutionary algorithms, and hold a central interest for digital art. However, there is an existing difficulty in finding suitable heuristics for measuring perceived qualities of an image. Particular difficulty remains for quantifying aspects of style and shape. In an attempt to bridge the divide between computer vision and cognitive perception, one set of proposed measures from previous studies relate to image spatial frequencies. Based on existing research which uses power spectral density of spatial frequencies as an effective metric for image classification and retrieval, we believe this measure and others based on Fourier decomposition may be effective for guiding evolutionary texture synthesis. We briefly compare some alternative means of using frequency analysis to guide evolution of shape and composition, and refine fitness measures based on Fourier analysis and spatial frequency. Our exploration has been conducted with the goals of improving intuition of these measures, evaluating the utility of these measures for image composition, and observing possible adaptations of their use in digital evolutionary art. Multiple evolutionary guidance schemes with consideration of the spatial frequencies' power spectra and phase have been evaluated across numerous targets with mixed results. We will display our exploration of power spectral density measures and their effectiveness as used for evolutionary algorithm fitness targets, particularly for basic compositional guidance in evolutionary art. We also observe and analyze a previously identified phenomenon of spatial properties which could lead to further consideration of visual comfort and aesthetics.
    • An Implementation of Separation Logic in Coq

      Wang, Yi; Department of Computer Science
      For certain applications, the correctness of software involved is crucial, particularly if human life is in danger. In order to achieve correctness, common practice is to gather evidence for program correctness by testing the system. Even though testing may find certain errors in the code, it cannot guarantee that the program is error-free. The program of formal verification is the act of proving or disproving the correctness of the system with respect to a formal specification. A logic for program verification is the so-called Hoare Logic. Hoare Logic can deal with programs that do not utilize pointers, i.e., it allows reasoning about programs that do not use shared mutable data structures. Separation Logic extends Hoare logic that allows pointers, including pointer arithmetic, in the programming language. It has four-pointer manipulating commands which perform the heap operations such as lookup, allocation, deallocation, and mutation. We introduce an implementation of separation logic in the interactive proof system Coq. Besides verifying that separation logic is correct, we will provide several examples of programs and their correctness proof.

      Li, Connor; Department of Computer Science
      Due to the many advances made in designing algorithms, especially the ones used in bioinformatics, it is becoming harder and harder to improve their efficiencies. Therefore, hardware acceleration using General-Purpose computing on Graphics Processing Unit has become a popular choice. BWA-MEM is an important part of the BWA software package for sequence mapping. Because of its high speed and accuracy, we choose to parallelize the popular short DNA sequence mapper. BWA has been a prevalent single node tool in genome alignment, and it has been widely studied for acceleration for a long time since the first version of the BWA package came out. This thesis presents the Big Data GPGPU distributed BWA-MEM, a tool that combines GPGPU acceleration and distributed computing. The four hardware parallelization techniques used are CPU multi-threading, GPU paralleled, CPU distributed, and GPU distributed. The GPGPU distributed software typically outperforms other parallelization versions. The alignment is performed on a distributed network, and each node in the network executes a separate GPGPU paralleled version of the software. We parallelize the chain2aln function in three levels. In Level 1, the function ksw\_extend2, an algorithm based on Smith-Waterman, is parallelized to handle extension on one side of the seed. In Level 2, the function chain2aln is parallelized to handle chain extension, where all seeds within the same chain are extended. In Level 3, part of the function mem\_align1\_core is parallelized for extending multiple chains. Due to the program's complexity, the parallelization work was limited at the GPU version of ksw\_extend2 parallelization Level 3. However, we have successfully combined Spark with BWA-MEM and ksw\_extend2 at parallelization Level 1, which has shown that the proposed framework is possible. The paralleled Level 3 GPU version of ksw\_extend2 demonstrated noticeable speed improvement with the test data set.
    • Improving Short DNA Sequence Alignment with Parallel Computing

      Peters, Darren; Department of Computer Science (2013-04-11)
      Variations in different types of genomes have been found to be responsible for a large degree of physical diversity such as appearance and susceptibility to disease. Identification of genomic variations is difficult and can be facilitated through computational analysis of DNA sequences. Newly available technologies are able to sequence billions of DNA base pairs relatively quickly. These sequences can be used to identify variations within their specific genome but must be mapped to a reference sequence first. In order to align these sequences to a reference sequence, we require mapping algorithms that make use of approximate string matching and string indexing methods. To date, few mapping algorithms have been tailored to handle the massive amounts of output generated by newly available sequencing technologies. In otrder to handle this large amount of data, we modified the popular mapping software BWA to run in parallel using OpenMPI. Parallel BWA matches the efficiency of multithreaded BWA functions while providing efficient parallelism for BWA functions that do not currently support multithreading. Parallel BWA shows significant wall time speedup in comparison to multithreaded BWA on high-performance computing clusters, and will thus facilitate the analysis of genome sequencing data.
    • Improving the Scalability of Reduct Determination in Rough Sets

      Mahmood, Shahid; Department of Computer Science (2013-04-09)
      Rough Set Data Analysis (RSDA) is a non-invasive data analysis approach that solely relies on the data to find patterns and decision rules. Despite its noninvasive approach and ability to generate human readable rules, classical RSDA has not been successfully used in commercial data mining and rule generating engines. The reason is its scalability. Classical RSDA slows down a great deal with the larger data sets and takes much longer times to generate the rules. This research is aimed to address the issue of scalability in rough sets by improving the performance of the attribute reduction step of the classical RSDA - which is the root cause of its slow performance. We propose to move the entire attribute reduction process into the database. We defined a new schema to store the initial data set. We then defined SOL queries on this new schema to find the attribute reducts correctly and faster than the traditional RSDA approach. We tested our technique on two typical data sets and compared our results with the traditional RSDA approach for attribute reduction. In the end we also highlighted some of the issues with our proposed approach which could lead to future research.
    • An Interactive Theorem Prover for First-Order Dynamic Logic

      Das, Tuhin Kanti; Department of Computer Science (Brock University, 2012-10-11)
      Dynamic logic is an extension of modal logic originally intended for reasoning about computer programs. The method of proving correctness of properties of a computer program using the well-known Hoare Logic can be implemented by utilizing the robustness of dynamic logic. For a very broad range of languages and applications in program veri cation, a theorem prover named KIV (Karlsruhe Interactive Veri er) Theorem Prover has already been developed. But a high degree of automation and its complexity make it di cult to use it for educational purposes. My research work is motivated towards the design and implementation of a similar interactive theorem prover with educational use as its main design criteria. As the key purpose of this system is to serve as an educational tool, it is a self-explanatory system that explains every step of creating a derivation, i.e., proving a theorem. This deductive system is implemented in the platform-independent programming language Java. In addition, a very popular combination of a lexical analyzer generator, JFlex, and the parser generator BYacc/J for parsing formulas and programs has been used.
    • Inverse Illumination Design with Genetic Programming

      Moylan, Kelly; Department of Computer Science
      Interior illumination is a complex problem involving numerous interacting factors. This research applies genetic programming towards problems in illumination design. The Radiance system is used for performing accurate illumination simulations. Radiance accounts for a number of important environmental factors, which we exploit during fitness evaluation. Illumination requirements include local illumination intensity from natural and artificial sources, colour, and uniformity. Evolved solutions incorporate design elements such as artificial lights, room materials, windows, and glass properties. A number of case studies are examined, including many-objective problems involving up to 7 illumination requirements, the design of a decorative wall of lights, and the creation of a stained-glass window for a large public space. Our results show the technical and creative possibilities of applying genetic programming to illumination design.
    • L-Fuzzy Relations in Coq

      Jackson, Ethan; Department of Computer Science (Brock University, 2014-09-05)
      Heyting categories, a variant of Dedekind categories, and Arrow categories provide a convenient framework for expressing and reasoning about fuzzy relations and programs based on those methods. In this thesis we present an implementation of Heyting and arrow categories suitable for reasoning and program execution using Coq, an interactive theorem prover based on Higher-Order Logic (HOL) with dependent types. This implementation can be used to specify and develop correct software based on L-fuzzy relations such as fuzzy controllers. We give an overview of lattices, L-fuzzy relations, category theory and dependent type theory before describing our implementation. In addition, we provide examples of program executions based on our framework.
    • L-Fuzzy Structured Query Language

      Adjei, Evans; Department of Computer Science
      Lattice valued fuzziness is more general than crispness or fuzziness based on the unit interval. In this work, we present a query language for a lattice based fuzzy database. We define a Lattice Fuzzy Structured Query Language (LFSQL) taking its membership values from an arbitrary lattice L. LFSQL can handle, manage and represent crisp values, linear ordered membership degrees and also allows membership degrees from lattices with non-comparable values. This gives richer membership degrees, and hence makes LFSQL more flexible than FSQL or SQL. In order to handle vagueness or imprecise information, every entry into an L-fuzzy database is an L-fuzzy set instead of crisp values. All of this makes LFSQL an ideal query language to handle imprecise data where some factors are non-comparable. After defining the syntax of the language formally, we provide its semantics using L-fuzzy sets and relations. The semantics can be used in future work to investigate concepts such as functional dependencies. Last but not least, we present a parser for LFSQL implemented in Haskell.
    • Landscape Aware Algorithm Configuration

      Dennis, Cody; Department of Computer Science
      The issue of parameter selection cannot be ignored if optimal performance is to be obtained from an algorithm on a specific problem or if a collection of algorithms are going to be compared in a fair manner. Unfortunately, adequately addressing the issue of parameter selection is time consuming and computationally expensive. Searching for appropriate control parameters generally requires much more time than actually solving the problem at hand due to the need to perform many complete runs of the target algorithm. The number of runs required to obtain thorough and equal coverage of the parameter space grows exponentially with the number of parameters. As a result, costs associated with parameter selection become a limiting factor in the scale of problems that can be investigated. The primary goal of this work is to reduce the costs of parameter selection. In pursuit of this goal, this thesis examines the use of neural networks to intelligently select appropriate control parameter values based on the characteristics of the problem at hand. Two general purpose approaches are evaluated: one that predicts a single set of control parameters to use throughout a run of the target algorithm; and, another that dynamically adjusts algorithm control parameters at run time. These approaches are examined in detail using the Particle Swarm Optimization algorithm. A comparison with state of the art automated tools for control parameter selection indicates that the cost of parameter selection can be significantly reduced.
    • Learning Strategies for Evolved Co-operating Multi-Agent Teams in Pursuit Domain

      Grossi, Gina; Department of Computer Science
      This study investigates how genetic programming (GP) can be effectively used in a multi-agent system to allow agents to learn to communicate. Using the predator-prey scenario and a co-operative learning strategy, communication protocols are compared as multiple predator agents learn the meaning of commands in order to achieve their common goal of first finding, and then tracking prey. This work is divided into three parts. The first part uses a simple GP language in the Pursuit Domain Development Kit (PDP) to investigate several communication protocols, and compares the predators' ability to find and track prey when the prey moves both linearly and randomly. The second part, again in the PDP environment, enhances the GP language and fitness measure in search of a better solution for when the prey moves randomly. The third part uses the Ms. Pac-Man Development Toolkit to test how the enhanced GP language performs in a game environment. The outcome of each part of this study reveals emergent behaviours in different forms of message sending patterns. The results from Part 1 reveal a general synchronization behaviour emerging from simple message passing among agents. Additionally, the results show a learned behaviour in the best result which resembles the behaviour of guards and reinforcements found in popular stealth video games. The outcomes from Part 2 reveal an emergent message sending pattern such that one agent is designated as the "sending" agent and the remaining agents are designated as "receiving" agents. Evolved agents in the Ms. Pac-Man simulator show an emergent sending pattern in which there is one agent that sends messages when it is in view of the prey. In addition, it is shown that evolved agents in both Part 2 and Part 3 are able to learn a language. For example, "sending" agents are able to make decisions about when and what type of command to send and "receiving" agents are able to associate the intended meaning to commands.
    • Lossy Compression of Quality Values in Next-Generation Sequencing Data

      Suaste Morales, Veronica; Department of Computer Science
      In this work we address the compression of SAM files which is the standard output file for DNA assembly. We specifically study lossy compression techniques used for quality values reported in the SAM file and we analyse the impact of such lossy techniques in the CRAM format. We also study the impact of these lossy techniques in the SNP calling process. Our results show that lossy techniques allow a better compression ratio than the one obtained with the original quality values. We also show that SNP calling performance is not negatively affected. Moreover we confirmed that some of the lossy techniques can even boost the SNP calling performance.
    • Managing Diversity and Many Objectives in Evolutionary Design

      BASHER, SHEIKH FAISHAL; Department of Computer Science
      This thesis proposes a new approach to evolving a diversity of high-quality solutions for problems having many objectives. Mouret and Clune's MAP-Elites algorithm has been proposed as a way to evolve an assortment of diverse solutions to a problem. We extend MAP-Elites in a number of ways. Firstly, we introduce a many-objective strategy called sum-of-ranks, which enables problems with many objectives (4 and more) to be considered in the MAP. Secondly, we enhance MAP-Elites by extending it with multiple solutions per "grid" cell (the original MAP-Elites saves only a single solution per cell). A few different ways of selecting cell members for reproduction are also considered. We test the new MAP-Elites strategies on the evolutionary art application of image generation. Using procedural textures, genetic programming is used with upwards of 15 lightweight image features to guide fitness. The goal is to evolve images that share image features with a given target image. Our experiments show that the new MAP-Elites algorithms produce a large number of diverse solutions of varying quality. The extended MAP-Elites algorithm is also statistically competitive compared to vanilla GP in this application domain.
    • MDP-based Vehicular Network Connectivity Model for VCC Management

      Saad, Abubakar; Department of Computer Science
      Vehicular Cloud computing is a new paradigm in which vehicles collaboratively exchange data and resources to support services and problem-solving in urban environments. Characteristically, such Clouds undergo severe challenging conditions from the high mobility of vehicles, and by essence, they are rather dynamic and complex. Many works have explored the assembling and management of Vehicular Clouds with designs that heavily focus on mobility. However, a mobility-based strategy relies on vehicles' geographical position, and its feasibility has been questioned in some recent works. Therefore, we present a more relaxed Vehicular Cloud management scheme that relies on connectivity. This work models uncertainty and considers every possible chance a vehicle may be available through accessible communication means, such as vehicle-to-everything (V2X) communications and the vehicle being in the range of road-side units (RSUs) for data transmissions. We propose an markov-decisision process (MDP) model to track vehicles' connection status and estimate their reliability for data transmissions. Also, from analyses, we observed that higher vehicle connectivity presents a trace of repeated connection patterns. We reinforce the connectivity status by validating it through an availability model to distinguish the vehicles which support high availability regardless of their positioning. The availability model thus determines the suitability of the MDP model in a given environment.
    • Mixed Media in Evolutionary Art

      Maslen, Jordan; Department of Computer Science
      This thesis focuses on creating evolutionary art with genetic programming. The main goal of the system is to produce novel stylized images using mixed media. Mixed media on a canvas is the use of multiple artistic effects being used to produce interesting and new images. This approach uses a genetic program (GP) in which each individual in the population will represent their own unique solution. The evaluation method being used to determine the fitness of each individual will be direct colour matching of the GP canvas and target image. The secondary goal was to see how well different computer graphic techniques work together. In particular, bitmaps have not been studied much in evolutionary art. Results show a variety of unique solutions with the application of mixed media.
    • Modal and Relevance Logics for Qualitative Spatial Reasoning

      Ghosh, Pranab kumar; Department of Computer Science
      Qualitative Spatial Reasoning (QSR) is an alternative technique to represent spatial relations without using numbers. Regions and their relationships are used as qualitative terms. Mostly peer qualitative spatial reasonings has two aspect: (a) the first aspect is based on inclusion and it focuses on the ”part-of” relationship. This aspect is mathematically covered by mereology. (b) the second aspect focuses on topological nature, i.e., whether they are in ”contact” without having a common part. Mereotopology is a mathematical theory that covers these two aspects. The theoretical aspect of this thesis is to use classical propositional logic with non-classical relevance logic to obtain a logic capable of reasoning about Boolean algebras i.e., the mereological aspect of QSR. Then, we extended the logic further by adding modal logic operators in order to reason about topological contact i.e., the topological aspect of QSR. Thus, we name this logic Modal Relevance Logic (MRL). We have provided a natural deduction system for this logic by defining inference rules for the operators and constants used in our (MRL) logic and shown that our system is correct. Furthermore, we have used the functional programming language and interactive theorem prover Coq to implement the definitions and natural deduction rules in order to provide an interactive system for reasoning in the logic.
    • Modeling Metal Protein Complexes from Experimental Extended X-ray Absorption Fine Structure using Computational Intelligence

      Price, Collin; Department of Computer Science (Brock University, 2014-10-30)
      Experimental Extended X-ray Absorption Fine Structure (EXAFS) spectra carry information about the chemical structure of metal protein complexes. However, pre- dicting the structure of such complexes from EXAFS spectra is not a simple task. Currently methods such as Monte Carlo optimization or simulated annealing are used in structure refinement of EXAFS. These methods have proven somewhat successful in structure refinement but have not been successful in finding the global minima. Multiple population based algorithms, including a genetic algorithm, a restarting ge- netic algorithm, differential evolution, and particle swarm optimization, are studied for their effectiveness in structure refinement of EXAFS. The oxygen-evolving com- plex in S1 is used as a benchmark for comparing the algorithms. These algorithms were successful in finding new atomic structures that produced improved calculated EXAFS spectra over atomic structures previously found.