Design and Development of Computer Specification Recommendation System Based on User Budget With Genetic Algorithm



INTRODUCTION
According to International Data Corporation (IDC), overall sales of personal computers in Indonesia reached 2 million units in 2016. This indicates that demand for computers in Indonesia remains substantial, especially for office and gaming use.
In general, computer stores sell two types of computer: built-up computers and assembled computers. Built-up computers are made directly by the manufacturer, while assembled computers are put together by technicians based on the buyer's request.
Based on a preliminary survey, about 72.5% of the respondents who prefer to buy built-up computers said they did so because they do not know computer components. Meanwhile, 86.8% of the respondents who prefer to buy assembled computers said they did so because they could adjust the price of the computer to their budget.
The preliminary survey concludes that people buy built-up computers because they do not know computer components well. If a recommendation system could provide a compatible set of component specifications based on their budget, they would buy an assembled computer, since they could adjust its price to their budget and would no longer need to know computer components.
Research related to this problem has been carried out by other researchers. Imbar in 2013 [1] selected the specification of computer components based on the buyer's budget. However, that research was based on a Greedy Algorithm, which produced sub-optimal solutions because the algorithm did not examine all available alternative solutions.
Another study, conducted by Haryanty in 2012 [2], used a Genetic Algorithm with the Roulette Wheel selection method and produced combinations of products that were optimal with respect to the buyer's budget. However, that research used only five computer components, i.e. processor, motherboard, memory (RAM), graphics card and hard disk, and did not check the compatibility of the RAM with the motherboard used.
This research uses a Genetic Algorithm to improve on the previous research by using seven computer component products and by using Tournament Selection as the selection method, which has an advantage in convergence speed over proportionate roulette wheel selection. It is expected to help people who buy assembled computers cope with the wide variety of computer components and with their unfamiliarity with component compatibility.

A. Genetic Algorithm
Genetic algorithms (GAs) are search methods based on principles of natural selection and genetics.
GAs encode the decision variables of a search problem into finite-length strings of alphabets of certain cardinality. The strings, which are candidate solutions to the search problem, are referred to as chromosomes, the alphabets are referred to as genes, and the values of genes are called alleles. For example, in a problem such as the traveling salesman problem, a chromosome represents a route, and a gene may represent a city. To evolve good solutions and to implement natural selection, a measure for distinguishing good solutions from bad solutions is needed. The measure could be an objective function that is a mathematical model or a computer simulation, or it can be a subjective function where humans choose better solutions over worse ones. The fitness measure must determine a candidate solution's relative fitness, which will subsequently be used by the GA to guide the evolution of good solutions [1].
Another important concept of GAs is the notion of population. Unlike traditional search methods, genetic algorithms rely on a population of candidate solutions. The population size, which is usually a user-specified parameter, is one of the important factors affecting the scalability and performance of genetic algorithms. For example, small population sizes might lead to premature convergence and yield substandard solutions. On the other hand, large population sizes lead to unnecessary expenditure of valuable computational time.
Once the problem is encoded in a chromosomal manner and a fitness measure for discriminating good solutions from bad ones has been chosen, the algorithm starts evolving solutions using the following steps:
1. Initialization. The initial population of candidate solutions is usually generated randomly across the search space. However, domain-specific knowledge or other information can be easily incorporated.
2. Evaluation. Once the population is initialized or an offspring population is created, the fitness values of the candidate solutions are evaluated.
3. Selection. Selection allocates more copies of those solutions with higher fitness values and thus imposes the survival-of-the-fittest mechanism on the candidate solutions. The main idea of selection is to prefer better solutions to worse ones, and many selection procedures have been proposed to accomplish this idea, including roulette-wheel selection, stochastic universal selection, ranking selection and tournament selection.
4. Recombination. Recombination combines parts of two or more parental solutions to create new, possibly better solutions (i.e. offspring). There are many ways of accomplishing this (some of which are discussed in the next section), and competent performance depends on a properly designed recombination mechanism. The offspring under recombination will not be identical to any particular parent and will instead combine parental traits in a novel manner (Goldberg, 2002).
5. Mutation. While recombination operates on two or more parental chromosomes, mutation locally but randomly modifies a solution. Again, there are many variations of mutation, but it usually involves one or more changes being made to an individual's trait or traits. In other words, mutation performs a random walk in the vicinity of a candidate solution.
6. Replacement. The offspring population created by selection, recombination, and mutation replaces the original parental population. Many replacement techniques such as elitist replacement, generation-wise replacement and steady-state replacement methods are used in GAs.
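The six steps can be sketched as a single loop. The following Python code is a minimal illustration only, not the implementation used in this research. The function name `run_ga`, its parameters and defaults, the per-chromosome application of the mutation rate, and the choice of generation-wise replacement with one elite copy are all assumptions made for the example. Chromosomes are assumed to be lists of integers, each gene bounded by a `(low, high)` pair.

```python
import random

def run_ga(fitness, gene_bounds, pop_size=30, generations=50,
           crossover_rate=0.6, mutation_rate=0.05, tournament_size=2, seed=0):
    """Illustrative generational GA over integer-valued chromosomes."""
    rng = random.Random(seed)

    # 1. Initialization: random chromosomes within the gene bounds.
    population = [[rng.randint(lo, hi) for lo, hi in gene_bounds]
                  for _ in range(pop_size)]

    for _ in range(generations):
        # 2. Evaluation: compute the fitness of every candidate once.
        scores = [fitness(c) for c in population]

        # 3. Selection: tournament among randomly drawn individuals.
        def select():
            contenders = rng.sample(range(pop_size), tournament_size)
            return population[max(contenders, key=lambda i: scores[i])]

        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = select(), select()
            # 4. Recombination: gene-by-gene mix of the two parents.
            if rng.random() < crossover_rate:
                child = [a if rng.random() < 0.5 else b
                         for a, b in zip(p1, p2)]
            else:
                child = list(p1)
            # 5. Mutation: redraw one randomly chosen gene (assumed to be
            #    applied per chromosome with probability mutation_rate).
            if rng.random() < mutation_rate:
                i = rng.randrange(len(child))
                lo, hi = gene_bounds[i]
                child[i] = rng.randint(lo, hi)
            offspring.append(child)

        # 6. Replacement: offspring replace the parents, but one copy of
        #    the best parent is kept (elitist, an assumption of this sketch).
        elite = population[max(range(pop_size), key=lambda i: scores[i])]
        offspring[0] = list(elite)
        population = offspring

    return max(population, key=fitness)
```

As a toy usage, `run_ga(sum, [(0, 9)] * 5)` searches for a five-gene chromosome with a large gene sum. Fixing `seed` makes a run repeatable, which is convenient when comparing population sizes.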

B. Tournament Selection
GAs use a selection mechanism to select individuals from the population to insert into a mating pool. Individuals from the mating pool are used to generate new offspring, with the resulting offspring forming the basis of the next generation. A selection mechanism in a GA is simply a process that favors the selection of better individuals in the population for the mating pool. The selection pressure is the degree to which the better individuals are favored: the higher the selection pressure, the more the better individuals are favored. This selection pressure drives the GA to improve the population fitness over succeeding generations. The convergence rate of a GA is largely determined by the selection pressure, with higher selection pressures resulting in higher convergence rates. If the selection pressure is too low, the convergence rate will be slow, and the GA will unnecessarily take longer to find the optimal solution. If the selection pressure is too high, there is an increased chance of the GA prematurely converging to an incorrect (suboptimal) solution.
Tournament selection provides selection pressure by holding a tournament among "s" competitors, with "s" being the tournament size. The winner of the tournament is the individual with the highest fitness of the "s" tournament competitors. The winner is then inserted into the mating pool. The mating pool, being comprised of tournament winners, has a higher average fitness than the average population fitness. This fitness difference provides the selection pressure, which drives the GA to improve the fitness of each succeeding generation. Increased selection pressure can be provided by simply increasing the tournament size "s", as the winner from a larger tournament will, on average, have a higher fitness than the winner of a smaller tournament [2].
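A tournament of size "s" can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in this research: the function names `tournament_select` and `build_mating_pool` are hypothetical, and drawing the competitors without replacement is an assumption, since the description above does not specify it.

```python
import random

def tournament_select(population, fitnesses, s, rng=random):
    """Return the fittest of s competitors drawn at random.

    Competitors are drawn without replacement (an assumption of this sketch).
    """
    contenders = rng.sample(range(len(population)), s)
    winner = max(contenders, key=lambda i: fitnesses[i])
    return population[winner]

def build_mating_pool(population, fitnesses, s, rng=random):
    """Hold one tournament per slot so the pool matches the population size."""
    return [tournament_select(population, fitnesses, s, rng)
            for _ in population]
```

With `s = 1` the selection is purely random, and with `s` equal to the population size the best individual always wins, which illustrates how the tournament size controls the selection pressure.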

C. Uniform Crossover
Uniform crossover does not fragment the chromosomes for recombination. Each gene in the offspring is created by copying it from the parent chosen according to the corresponding bit in a binary crossover mask of the same length as the parent chromosomes. If the bit in the crossover mask is 1, the resultant gene is copied from the first parent, and if the bit is 0, the resultant gene is copied from the second parent. A new crossover mask is generated arbitrarily for each pair of parent chromosomes. The number of crossover points is not fixed in advance, so the offspring have a mixture of genes from both parents [3].
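The mask-based copying can be illustrated with the short Python sketch below. The function name `uniform_crossover` is hypothetical, and drawing each mask bit with equal probability is an assumption; the mask is returned together with the child only so that the result is easy to inspect.

```python
import random

def uniform_crossover(parent1, parent2, rng=random):
    """Build one child by following a freshly generated binary mask."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    # A new mask is generated for every pair of parents.
    mask = [rng.randint(0, 1) for _ in parent1]
    # Bit 1 copies the gene from the first parent, bit 0 from the second.
    child = [a if bit == 1 else b
             for a, b, bit in zip(parent1, parent2, mask)]
    return child, mask
```

For example, with the mask `[1, 0, 0, 1]` the child takes its first and fourth genes from the first parent and the remaining genes from the second parent.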

D. Uniform Mutation
The mutation operator randomly selects a position in the chromosome and changes the corresponding allele, thereby modifying information. The need for mutation comes from the fact that, as the less fit members of successive generations are discarded, some aspects of genetic material could be lost forever. By performing occasional random changes in the chromosomes, GAs ensure that new parts of the search space are reached, which reproduction and crossover alone could not fully guarantee [4]. Uniform Mutation can be done using the following steps [5]:
1. Choose one gene randomly.
2. Replace the value of the chosen gene with a uniform random value selected between the user-specified upper and lower bounds for that gene.
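The two steps of uniform mutation can be sketched in Python as follows. The function name `uniform_mutation` and the `bounds` argument are hypothetical, and genes are assumed to be integers (for instance, indices into a product list), so the new value is drawn with `randint`, which includes both bounds.

```python
import random

def uniform_mutation(chromosome, bounds, rng=random):
    """Return a copy of the chromosome with one gene redrawn uniformly."""
    mutated = list(chromosome)           # leave the original untouched
    i = rng.randrange(len(mutated))      # step 1: choose one gene at random
    lo, hi = bounds[i]
    mutated[i] = rng.randint(lo, hi)     # step 2: uniform value in [lo, hi]
    return mutated
```

Because the new value is drawn from the whole range, it may coincide with the old one, in which case the chromosome is unchanged.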

A. Use Case Diagram
Use case diagrams are usually referred to as behavior diagrams used to describe a set of actions (use cases) that some system or systems (subject) should or can perform in collaboration with one or more external users of the system (actors). Each use case should provide some observable and valuable result to the actors or other stakeholders of the system.

A. Computer Specification Recommendation System
Users can define the computer components that will be used for the recommendation. The next page shows all of the selected components and the minimum product specification for each. The user then inputs a budget and waits until the algorithm has found a solution. In Figure 3 the user has entered IDR 3,500,000 as the budget. The combination of computer components is shown after the algorithm has found a solution. The compatibility of each component is checked automatically by the application, so the user does not have to worry about compatibility issues.
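The kind of automatic check described above can be illustrated with a small Python sketch. It is not the application's actual code: the `Part` class, its fields, the function names, the two rules (processor socket must match the motherboard socket, RAM type must match the motherboard's RAM type) and all part names and prices are assumptions made up for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Part:
    """Hypothetical product record; fields are assumptions of this sketch."""
    name: str
    price: int          # price in IDR
    socket: str = ""    # used by processors and motherboards
    ram_type: str = ""  # used by RAM modules and motherboards

def is_compatible(processor, motherboard, ram):
    """Assumed rules: matching CPU socket and matching RAM type."""
    return (processor.socket == motherboard.socket
            and ram.ram_type == motherboard.ram_type)

def total_price(parts):
    return sum(p.price for p in parts)

def fits_budget(parts, budget):
    return total_price(parts) <= budget

# Made-up example data, for illustration only.
cpu = Part("CPU X", 1_200_000, socket="S1")
board = Part("Board Y", 900_000, socket="S1", ram_type="DDR4")
ram = Part("RAM Z", 500_000, ram_type="DDR4")
build = [cpu, board, ram]
```

Here `is_compatible(cpu, board, ram)` and `fits_budget(build, 3_500_000)` both hold, so this made-up build would be an acceptable candidate under the assumed rules.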

B. Testing
Testing was conducted with a budget input of IDR 3,500,000, which is the worst case for the algorithm. The test used different population sizes and was run 30 times for each combination of population size and processor socket. The mutation rate and crossover rate were set to 0.05 and 0.6 respectively. It has been observed that the values suggested by DeJong (1975), namely a mutation probability of 0.001 and single-point crossover with probability 0.6 at a population size of 50-100, have been used in many GA implementations. A mutation probability above 0.05 is in general harmful for the optimal performance of GAs [6].

Fig 1 shows the Use Case Diagram of the application.

Fig 1. Use Case Diagram

B. Class Diagram
A Class Diagram is a UML structure diagram which shows the structure of the designed system at the level of classes and interfaces, and shows their features, constraints and relationships (associations, generalizations, dependencies, etc.). Fig 2 shows the Class Diagram of the application.

Fig 3. Defining Components

After the user has defined the components, he/she will define the minimum specification of products for each component.

Fig 4. Defining Minimum Specification of Products

Table 1. Amount of Data Used in the Test

Table 2. Test Result by Using 10 Population Size

Table 3. Test Result by Using 30 Population Size