Dynamic coefficient-based cluster head election in wireless sensor networks

Because the selection of cluster heads has a significant effect on the lifetime of the network, many researchers have proposed various cluster head election methodologies for cluster-based WSNs. Although recent studies have focused on adaptive approaches, in which different parameters are combined in a function, the effect of these parameters on cluster head election has not been investigated in detail. In this paper, a small-scale dataset is first constructed by evaluating the death of the first, the half and the last node in a cluster-based WSN using three popular cluster head parameters: the remaining energy of the nodes, the intra-cluster communication cost and the number of neighbours. Based on the results, a dynamically changeable coefficient-based adaptive cluster head election, DCoCH, is proposed. The coefficients of the cluster head election parameters change across three different periods of the lifetime of the network. DCoCH is compared with two recent adaptive cluster head election methodologies for various WSN parameters, and the results show that DCoCH outperforms equivalent approaches for different values of the location of the base station, the size of the network, the number of nodes, the initial batteries of the nodes and the distribution of the nodes.


Introduction
Since energy efficiency is one of the main problems in Wireless Sensor Networks (WSNs) consisting of resource-constrained sensor nodes, hierarchical routing protocols have been widely used in literature [1]. In hierarchical routing, the sensor nodes in the system form clusters in accordance with a decision, either central or distributed, and transmit data to selected cluster heads (CHs). After aggregating the data gathered from the cluster member nodes, CHs send the data either directly to the base station in a single-hop manner or to a relay node (typically another CH) in a multi-hop manner. Data transmission continues periodically until all nodes in the network have depleted their batteries. Since replacing the batteries of the nodes is usually difficult or even impossible due to harsh environmental conditions, the energy of the nodes must be used efficiently to achieve a longer network lifetime [2]. Because CHs are responsible for collecting data from member nodes, performing data aggregation functions and transmitting data to the base station, selecting the best candidates as CHs is an important issue for energy efficiency in cluster-based WSNs.
The CHs in the system are either selected by a node free of resource constraints, typically the base station, in a centralized manner, or the CH determination process is conducted in each node of the network in a distributed manner. Because they provide scalability and reliability with energy and time efficiency, distributed CH election methodologies are more popular than their centralized alternatives [1], [2]. LEACH [16], in which the CHs are elected in a randomized manner in each round without a central control mechanism, is regarded as the first fully distributed clustering approach. Apart from the randomized methodology, two other approaches, called static and adaptive, are used in literature for determining CHs. In the static CH election process, super nodes, which usually have higher battery capacity and more powerful processing capabilities than ordinary sensor nodes, are positioned in predefined locations to carry out CH roles. Although this strategy minimizes the energy consumption of the network, it is not suitable for WSNs, which usually consist of resource-constrained homogeneous sensor devices. Alternatively, in the adaptive CH election methodology, various parameters, including the remaining energy of the nodes, the distance of the nodes to the base station or to each other, and the number of neighbouring nodes, are involved in the CH election process. Because it adapts to different networks and various environments, most of the recent studies in literature have used the adaptive CH election strategy [3]-[15].
A node in the network usually generates a number (either by an adaptive or a randomized methodology) to decide, independently of the others, whether to be a CH for the related round. In some studies [16]-[19], this number is compared with a threshold value to decide whether the node is elected as CH. Apart from the threshold approach, some studies [11], [12], [20], [21], such as TB-LEACH [20], use this value to decide the sleeping period of a node in the CH election process; these are called time-based CH scheduling approaches. Thus, the candidate nodes having the desired properties awaken earlier to announce their CH advertisements, while the nodes that do not satisfy the conditions are eliminated from the CH election race.
In this paper, a time-based, distributed and adaptive CH election algorithm is proposed. Unlike various similar studies in literature, the coefficients of the parameters forming the CH election function are designated after investigating the effect of these parameters on the death of the nodes during the lifetime of the network. Based on these evaluations, a dynamic coefficient-based CH election, DCoCH, is proposed. DCoCH is compared with existing time-based CH election algorithms for various values of different WSN parameters.
The remainder of this paper is organized as follows. Section 2 reviews recent adaptive CH election methodologies, while Section 3 explains the proposed approach. The results are evaluated and discussed in Section 4 and Section 5 concludes the paper.

Literature review
Since the election of CHs in hierarchical WSNs has a significant effect on efficient usage of energy in the network, there are many studies in literature focusing on selecting the best candidates as CHs by using various parameters. As is seen in Table 1, recent adaptive based CH election studies, in which the parameters are combined under a weighted function, have been reviewed due to the similarity to the topic of this paper.
The authors [3] have proposed DSBCA, in which the highest weighted node is elected as CH among k-hop neighbours. The weight of a node is determined by a function in which the parameters of the remaining energy, connection density, and the frequency of being elected as CH are controlled by several coefficients. Although the authors have stated that the coefficients should be in the range of 0 and 1, there is no more information about how they should be selected.
The nodes having the highest position metric (POS) are selected as CHs in CCWM [4]. POS is the sum of the multiplication of three functions, including the parameters of the number of neighbouring nodes, average energy of the neighbouring nodes and intra-cluster communication cost, with three different coefficients. The coefficients are selected as 0.3, 0.3 and 0.4 due to the requirement that the sum must be equal to 1. However, there is no more information about why these values are chosen.
In another study (MED-BS) [5], a weighted function is used to ensure that the nodes having the highest remaining energy, having the highest degree and being closest to the base station are selected as CHs. Since no coefficients are used, all parameters have an equal effect.
In CDDP [6], the base station determines the CHs in a centralized manner according to a function consisting of the remaining energy, the number of neighbouring nodes and the frequency of being selected as CH. The parameters are multiplied by equal coefficients (i.e., 0.33).
In optimized WCA [7], the number of neighbours, the remaining energy and the intra-cluster communication cost are multiplied by 0.3, 0.4 and 0.3, respectively, before the weighted parameters are combined. The nodes having the highest value according to the function are elected as CHs. The authors have stated that desired parameters can be made dominant in CH election by scaling the coefficients. However, the values of the coefficients are close to each other in optimized WCA.
In DEHCIC [8], the authors have considered the remaining energy of the nodes, the number of 2-hop mobile and static neighbours and the number of 1-hop mobile and static neighbours for CH selection. Initially, the battery of each node is compared with a predefined threshold value. The nodes having less energy than the threshold value cannot participate in CH election process. The CHs are then determined by a function consisting of various hop numbers, static and mobile neighbouring numbers multiplied by certain coefficients. Although the CHs are elected according to network topology and the remaining energy of the nodes, the authors have not stated which parameter has how much effect.
In WDARS [9], the authors have proposed that the lowest weighted nodes be elected as CHs. The weight of a node is calculated by a function that multiplies three different parameters by three coefficients having a sum of 1 and then combines them. The parameters are the remaining energy of the nodes, the distance to the base station and the distance to the weighted tree.
The remaining energy of the nodes and the node densities are used for CH election in an iterative manner in MLHEED [10]. The authors have proposed a time-based sleeping mechanism by using a weighted function in PEECR [11] algorithm. The remaining energy of the nodes, the number of neighbours and intra-cluster communication cost constitute the function. The parameters are multiplied with coefficients having a sum of 1. Similarly, the parameter of the remaining energy of the nodes is used for determining the sleeping period of the nodes in CATD [12].
In PSO-ECHS [13], the authors have aimed to minimize two functions, which are used for determining the CHs, by the particle swarm optimization (PSO) algorithm. The first is a function of the average intra-cluster communication cost and the distance of the CHs to the base station, while the second represents the remaining energy of the CHs. These two functions are multiplied by two coefficients having a sum of 1. These coefficients specify the effect of the distance and energy parameters on CH election. The authors have stated that different values of the coefficients were evaluated and that the value of 0.3 for distance and 0.7 for energy shows the best results. However, no detailed information is given about the samples and the result evaluation methodology.
In the CAMP [14] algorithm, the authors have stated that the coefficients, which control the weights of three parameters in CH election (the number of neighbouring nodes, the remaining energy and the distance to the base station), can vary according to the QoS requirements of applications. The death of the first node is investigated for 6 different values of the coefficients. The authors have indicated that applications using throughput for determining QoS should use higher values for the coefficient of node degree, applications that are not delay-tolerant should use higher values for the coefficient of distance to the base station, and applications aiming to prolong the lifetime of the network should use higher values for the coefficient of energy.
In NODIC [15], the authors have proposed a LEACH-like probabilistic CH election strategy by using two parameters, remaining energy of the nodes and intra-cluster communication cost, during 2 periods of network lifetime instead of combining them in an adaptive function. The energy parameter is utilized for CH election until the energy of the nodes decreases under a predefined threshold value. After that round, intra-cluster communication cost is used for CH election. Although NODIC has proposed alteration of CH election parameters only once during the lifetime of the network, only one parameter is chosen to elect CHs and the parameter of the number of neighbours is not considered. The CHs are either elected by remaining energy or intra-cluster communication cost. Besides, CH election is stochastic, which means that the desired CH is not guaranteed to be elected. Moreover, the alteration is made to the parameters, not their coefficients and only energy of the nodes is considered for this alteration.
It can clearly be said that recent studies in literature prefer distributed clustering to the centralized approach and that the remaining energy of the nodes is the most popular parameter used in the CH election process (Table 1). Although some of them have indicated that the coefficients of the parameters used in the function can vary, the effect of these coefficients on CH election has not been investigated in detail. In this study, unlike the studies in literature, the coefficients of the parameters used in the adaptive function are weighted dynamically. Although various parameters have been used in the CH election process in literature, a comprehensive study investigating the effect of these parameters on the lifetime of the network is lacking. Therefore, a dataset is created for different potential values of three parameters: the intra-cluster communication cost, the number of neighbouring nodes and the remaining energy of the nodes. The death of the first, the half and the last nodes is reported for all values of the dataset.
Depending on the results obtained, these parameters are included in an adaptive function, multiplied by coefficients that change dynamically based on the situation of the network. Owing to the use of time-based clustering like TB-LEACH, the desired candidates are guaranteed to be elected as CHs. The proposed scheme is compared with different time-based clustering approaches, including PEECR and CATD, on the death of the first and the last node for different values of WSN parameters. In this regard, the contributions of the proposed method, DCoCH, to the literature can be listed as follows.


Combining a time-based sleeping mechanism with an adaptive CH selection algorithm whose dynamic coefficients change over 3 different time periods throughout the lifetime of the network.
Creating a small-scale dataset by evaluating the rounds in which the first, the half and the last nodes die for different coefficients of the 3 most frequently chosen parameters in literature: intra-cluster communication range, the number of neighbouring nodes and the remaining energy of the nodes.
Evaluating the proposed scheme in detail against recent equivalent approaches over different values of various WSN parameters, including the number, the initial battery and the distribution of the nodes, the size of the network and the location of the base station.

Analysing the parameters of CH election
The studies in literature have used several parameters for CH election. Although the formulation of these parameters varies, the most popular ones are the remaining energy of the nodes, the intra-cluster communication cost and the number of neighbouring nodes. Apart from the issue of how to combine these parameters in a function, the effect of the weight of these parameters on network lifetime should be analysed. For this purpose, a small-scale dataset is constructed in this study, including sixty-six (66) possible values between 0 and 1 of the weights of these three parameters. For each record of the dataset, the death rounds of the first (FND), the half (HND) and the last nodes (LND) are estimated, as shown in Figure 1, Figure 2 and Figure 3, respectively. It can clearly be observed in Figure 1 that FND reaches its highest values for lower values of the intra-cluster communication cost and higher values of the remaining energy of the nodes. However, the exact opposite situation occurs for LND (Figure 3): higher values of the intra-cluster communication cost result in higher values of LND. Although the number of neighbouring nodes affects HND the most among all parameters (Figure 2), it does not dominate as distinctly as the parameters do for FND or LND. In the light of the results obtained, Table 2 is created. In the proposed algorithm, DCoCH, the weights of the parameters are chosen dynamically according to Table 2. The lifetime of the network is regarded as three sections. The first section starts with the beginning of the simulation and ends when the first node is dead. In this section, the weights of the parameters used in CH election are determined by the first row in Table 2. Similarly, the second (third) section starts with the death of the first (half) node and ends with the death of the half (last) node. The second row in Table 2 is used for the second section, while the third one is used for the last section.
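The text does not state how the 66 records were generated. One reading that is consistent with this count, and which is only an assumption here, is that the three weights are stepped in increments of 0.1 and constrained to sum to 1. The following minimal Python sketch enumerates such a grid; the function name `weight_grid` and the ordering (energy, cost, neighbours) are illustrative and do not come from the paper.

```python
from itertools import product


def weight_grid(steps: int = 10):
    """Enumerate (energy, cost, neighbours) weight triples on a 1/steps grid.

    ASSUMPTION: a step of 0.1 and a sum of 1 are inferred from the count
    of 66 records; the source does not confirm this construction.
    """
    grid = []
    for e, c in product(range(steps + 1), repeat=2):
        n = steps - e - c  # the third weight is fixed by the sum constraint
        if n >= 0:
            grid.append((e / steps, c / steps, n / steps))
    return grid


if __name__ == "__main__":
    grid = weight_grid()
    print(len(grid))  # number of weight combinations on the grid
```

Under this assumption, each triple would correspond to one simulation run for which FND, HND and LND are recorded.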

Time-Based clustering
In LEACH [16], the nodes compare the randomized number they generate at the beginning of each round with a predefined threshold value. If the randomized number is lower than the threshold, the node is elected as a CH. However, in TB-LEACH [20], this randomized number is used to specify the sleeping time of the nodes before they join the CH election process. Although the authors of TB-LEACH [20] have proposed a randomized number generation, recent studies have usually used an adaptive CH approach. For instance, as seen in Equation (1), in CATD [12], the waiting time of a node to announce itself as a CH is determined by using the energy of the nodes. In Equation (1), the waiting time of a node is expressed in terms of a random number between 0 and 1, the residual battery of the node, the sum of the initial batteries of the nodes in the network, the round number, the sum of the energy consumed by a CH and a non-CH in a round, and the number of nodes. However, in PEECR [11], the intra-cluster communication distance, the number of neighbours and the energy of the nodes are included in the CH election process (Equation (2)). Equation (2) involves a scale factor, three coefficients, the intra-communication distance and the number of neighbours. Even though the same parameters as PEECR [11] are used in this paper, these parameters are combined in a different function, as seen in Equation (3), which additionally uses the intra-communication range of a node. In both PEECR and the proposed approach, the batteries of the nodes, the intra-communication cost and the number of neighbouring nodes affect the election of CHs. In the proposed approach, unlike PEECR, each parameter is combined as the ratio of the value of the parameter for the node to the maximum possible value of that parameter.
In a cluster-based routing protocol, the lifetime of the network is divided into iterative rounds. In each round, after the cluster heads are elected and the clusters are formed, the data transmission stage begins, in which the member nodes (MNs) send data to their CHs and the CHs send all aggregated data to the BS. Figure 4 demonstrates the clustering stage of the network. Each node broadcasts the necessary information in a SELF_INFO message before calculating its waiting time. When the simulation time reaches this waiting time, the node decides whether to become a CH or an MN. If the node receives at least one cluster head advertisement message (CH_ADV) before awakening, it means that a better candidate within its communication range has become a CH. In this circumstance, the node decides to be an MN and selects the best CH to join. In literature, the closest CH is usually preferred among all candidates to reduce the energy consumption of the communication between the MN and the CH. When each node has decided its state for the related round, MNs send join request messages (JOIN_REQ) to the elected CHs. After collecting requests from its members, each CH creates a TDMA schedule for the data transmission of its member nodes and sends a join acknowledgement (JOIN_ACK) message to the MNs, including the time slot in which to transfer their data. After all data reaches the BS, a new round begins.
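The election stage described above can be sketched in Python as follows. This is a simplified, centralized simulation of a distributed process under stated assumptions: waiting times are supplied as inputs (the waiting-time equations are not reproduced here), all nodes share one advertisement range, messages are delivered instantly, and each MN joins the closest CH. The names `elect_cluster_heads`, `positions`, `wait_times` and `comm_range` are illustrative and are not taken from the paper.

```python
import math


def elect_cluster_heads(positions, wait_times, comm_range):
    """Simulate one time-based CH election round.

    positions:  {node_id: (x, y)}
    wait_times: {node_id: waiting time before the node wakes up}
    comm_range: distance within which a CH_ADV is heard (ASSUMED equal
                for all nodes)
    Returns (cluster_heads, membership), where membership maps each MN
    to the CH it joins.
    """
    cluster_heads = []
    # Nodes wake up in order of increasing waiting time (ties by node id).
    for node in sorted(wait_times, key=lambda n: (wait_times[n], n)):
        heard_adv = any(
            math.dist(positions[node], positions[ch]) <= comm_range
            for ch in cluster_heads
        )
        if not heard_adv:
            # No CH_ADV received before waking up: announce itself as CH.
            cluster_heads.append(node)

    membership = {}
    for node in positions:
        if node in cluster_heads:
            continue
        # JOIN_REQ is sent to the closest elected CH (a common choice).
        membership[node] = min(
            cluster_heads,
            key=lambda ch: math.dist(positions[node], positions[ch]),
        )
    return cluster_heads, membership
```

For example, with four nodes at (0, 0), (1, 0), (10, 0) and (11, 0), waiting times 0.2, 0.1, 0.4 and 0.3, and a range of 2, nodes 1 and 3 wake first in their neighbourhoods and become CHs, while nodes 0 and 2 join them.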

Clustering Algorithm
[Algorithm listing: each node in the system performs the clustering steps described above.]
When a node consumes 95% of its initial battery, the node is assumed to be dead and cannot participate in network processes any more. In literature, the lifetime of the network is estimated through the death round of the first (FND), half (HND) and the last (LND) nodes [22].
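The death criterion and the three lifetime metrics can be illustrated with a short Python sketch. The 95% threshold comes from the text; the function names and the convention that HND is the round of the ceil(n/2)-th death are assumptions made for illustration.

```python
def is_dead(initial_energy: float, remaining_energy: float,
            threshold: float = 0.95) -> bool:
    """A node is treated as dead once it has consumed `threshold` of its
    initial battery (95% in the text)."""
    consumed = initial_energy - remaining_energy
    return consumed >= threshold * initial_energy


def lifetime_metrics(death_rounds):
    """Return FND, HND and LND from the death round of every node.

    ASSUMPTION: HND is taken as the round of the ceil(n/2)-th death.
    """
    ordered = sorted(death_rounds)
    half_index = (len(ordered) + 1) // 2 - 1
    return {
        "FND": ordered[0],
        "HND": ordered[half_index],
        "LND": ordered[-1],
    }
```

The death rounds passed to `lifetime_metrics` would come from a simulation log; no values from the paper are reproduced here.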

The proposed approach: DCoCH
When analysing the effect of the parameters on CH election (Section 3.1), it is observed that, for increasing the lifetime of the network, the importance of a parameter in CH election depends on the simulation time (or round). Therefore, unlike the studies in literature, the values of the three coefficients change during the lifetime of the network in DCoCH. The alteration of the coefficients of the CH election parameters during the lifetime of the network is shown in Table 3. The coefficients take different values in three different time periods of the network. The first period starts with the beginning of the simulation and ends when the first node is dead (i.e. FND). The second period is between FND and HND. The last period continues from the end of the previous one to the end of the simulation (i.e. LND). Since the sum of these coefficients is equal to 1, the possible values can be divided into three equal intervals, namely low, medium and high, between 0.1 and 0.9. The coefficient values corresponding to Table 2 can be seen in Table 3. The assumptions of the proposed model can be listed as follows.
1. The nodes in the system are homogeneous, i.e. having the same initial energy, stationary, location-aware and randomly deployed in the network area.
2. The nodes can adjust their transmission power to transfer data.
3. The batteries cannot be recharged or changed during the lifetime of the network.
4. There is only one BS in the system, which is also stationary and is an energy-constrained free node.
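The three-period coefficient switching and the low/medium/high intervals described in this section can be sketched as follows. The period boundaries (FND and HND) and the interval bounds (0.1 to 0.9) follow the text; the coefficient values in `COEFFICIENTS` are hypothetical placeholders chosen only to illustrate the mechanism and are not the values of Table 2 or Table 3. All function and variable names are illustrative.

```python
def current_period(dead_nodes: int, total_nodes: int) -> int:
    """Return 1 before the first death, 2 until half of the nodes are dead,
    and 3 afterwards."""
    if dead_nodes == 0:
        return 1
    if 2 * dead_nodes < total_nodes:
        return 2
    return 3


def level(coefficient: float, low: float = 0.1, high: float = 0.9) -> str:
    """Map a coefficient onto three equal-width intervals in [low, high]."""
    width = (high - low) / 3
    if coefficient < low + width:
        return "low"
    if coefficient < low + 2 * width:
        return "medium"
    return "high"


# HYPOTHETICAL coefficients per period, NOT taken from Table 2 or Table 3.
# Each row sums to 1, as the text requires.
COEFFICIENTS = {
    1: {"energy": 0.8, "cost": 0.1, "neighbours": 0.1},
    2: {"energy": 0.1, "cost": 0.1, "neighbours": 0.8},
    3: {"energy": 0.1, "cost": 0.8, "neighbours": 0.1},
}


def coefficients_for(dead_nodes: int, total_nodes: int) -> dict:
    """Look up the coefficient set for the current period of the network."""
    return COEFFICIENTS[current_period(dead_nodes, total_nodes)]
```

A node would call `coefficients_for` at the start of each round, which requires that the number of dead nodes be known to it; how this information is disseminated is not specified in this sketch.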
The radio model used in DCoCH is first order radio model (FORM), proposed in LEACH, and the details of this model can be found in [16].
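For readers unfamiliar with the first order radio model, a minimal Python sketch is given below. The constants are the values commonly quoted for this model in the LEACH literature (50 nJ/bit for the electronics and 100 pJ/bit/m^2 for the amplifier); they are assumptions here, since the simulation settings of this paper are not reproduced in the text.

```python
E_ELEC = 50e-9     # J/bit, transmitter/receiver electronics (ASSUMED value)
EPS_AMP = 100e-12  # J/bit/m^2, transmit amplifier (ASSUMED value)


def tx_energy(bits: int, distance: float) -> float:
    """Energy to transmit `bits` over `distance` metres (d^2 path loss)."""
    return E_ELEC * bits + EPS_AMP * bits * distance ** 2


def rx_energy(bits: int) -> float:
    """Energy to receive `bits`."""
    return E_ELEC * bits


if __name__ == "__main__":
    # Example: a 2000-bit packet sent over 100 m, then received.
    print(tx_energy(2000, 100.0), rx_energy(2000))
```

Because the transmission cost grows with the square of the distance, longer intra-cluster and CH-to-BS links drain batteries much faster, which is why the intra-cluster communication cost appears among the CH election parameters.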
To avoid collisions in intra-cluster communication (i.e. data transmission between MNs and their CHs), time division multiple access (TDMA) is used. For data transmission in a cluster, each MN has a unique time slot, assigned by the CH, in which to transfer its sensed data. While an active MN sends data to its CH, the others switch to passive mode (i.e. sleep). Besides, code division multiple access (CDMA) is preferred for overcoming inter-cluster collisions (i.e. data transmission between CHs and the base station). Each CH chooses a random spreading code from a list and uses this code for changing the incoming signals. Because different codes are used, collisions in inter-cluster transmission can be minimized.

Analysing the performance of DCoCH
In order to analyse the performance of DCoCH, FND and LND are estimated by conducting extensive simulations with OMNeT++, and the results are compared with PEECR and CATD for various WSN parameters, including the number of nodes, the initial batteries of the nodes, the distribution of the nodes, the size of the network and the location of the base station (Table 4). Table 4 also gives the width and the height of the network area. The ten different distributions of a 100-node network having a size of 200 x 200 can be seen in Figure 5. Similarly, Figure 8 shows FND and LND of DCoCH, PEECR and CATD for five different base station locations, including centre (100, 100), corner (0, 200), border (0, 100), outer-1 (250, 50) and outer-2 (50, -100), on a 200 x 200 network having 100 nodes. FND and LND of DCoCH, PEECR and CATD for four different network areas, including 100 x 100, 300 x 300, 400 x 400 and 500 x 500, can be seen in Figure 9, while Figure 10 shows those for three different initial batteries, including 0.5 Joule, 1 Joule and 2 Joule, on a 500 x 500 network. As seen in Figure 6, DCoCH yields better performance for both FND and LND than the other protocols regardless of the distribution of the nodes. Whether the nodes have a uniform (Figure 5-e) or a non-uniform (Figure 5-c) distribution, DCoCH ranks first among its competitors. The performance of DCoCH on FND is 15% better than PEECR at minimum, 213% better than PEECR at maximum and 72% better than PEECR on average (Figure 6-a).
Similarly, when DCoCH and CATD are compared, these rates take the values of 11%, 75% and 34%, respectively (Figure 6-a). Besides, if the lifetime of the network is evaluated over LND, DCoCH is still preferable, providing between 2% and 14% better performance than PEECR and between 21% and 37% better performance than CATD (Figure 6-b). Although CATD achieves higher performance than PEECR for FND and PEECR performs better than CATD for LND, DCoCH performs the best for both FND and LND.
DCoCH yields up to 50% better performance in terms of FND and up to 30% better performance in terms of LND than PEECR and CATD for small-to-medium scale node numbers (Figure 7). Regardless of the location of the base station, i.e. either inside or outside the network area, DCoCH outperforms the other approaches for both FND and LND (Figure 8). The performance increase of DCoCH in terms of FND reaches up to 42% over PEECR if the base station is located at the centre of the network, and 117% over CATD if the base station is far away from the network area. Although the rates for LND are not as high as those for FND, DCoCH still outperforms PEECR and CATD in terms of LND by 7% and 25% on average, respectively.
The performance of DCoCH for FND is up to 50% better than that of PEECR for a 100 x 100 network area (Figure 9-a). However, this gain diminishes as the size of the network increases. Since the nodes consume much more energy for both intra- and inter-cluster communication in larger network areas, the death of the first node moves to an earlier round regardless of the clustering protocol. Therefore, the performance difference between the clustering approaches is hardly observable if the initial batteries of the nodes have relatively low values for large-scale networks. Despite this, DCoCH performs equally with the other approaches even in the worst-case scenario (i.e. FND for all protocols is 95 for the 300 x 300 network). In order to evaluate the exact performance of the protocols, different initial batteries are used for larger network sizes, i.e. 500 x 500 (Figure 10). When LND is analysed for different network sizes, DCoCH outperforms PEECR, although their performance is close. However, DCoCH yields up to 43% better performance than CATD.
The performance difference between DCoCH and the other protocols for LND increases as the initial batteries of the nodes increase on a 500 x 500 network (Figure 10). DCoCH outperforms PEECR by up to 5% (17%) and CATD by up to 27% (27%) for LND (FND). In conclusion, the results show that the dynamic coefficient-based adaptive CH election strategy performs better than equivalent adaptive approaches for various values of node number, network size, initial battery of the nodes and base station location, and for different node distributions.

Conclusion
In this paper, an adaptive CH election methodology based on dynamically selected coefficients of parameters is proposed for cluster-based WSN architectures. In order to determine the coefficients and their alteration frequency, a small-scale dataset is constructed. The dataset is obtained by simulating and evaluating the death of the first, the half and the last node of a cluster-based WSN system using 3 parameters for CH election: the neighbour number, the remaining battery of a node and the intra-cluster communication distance. Based on the dataset, the effect of the parameters on the lifetime of a node is investigated and, hence, a dynamic coefficient usage for three different periods of the network lifetime is proposed. The proposed algorithm, DCoCH, is compared with two recent adaptive CH election approaches for various WSN parameters, including node number, network size, initial battery of the nodes, location of the base station and node distributions, in terms of the death of the first and the last node. The results show that DCoCH outperforms equivalent CH election strategies in all situations.