Cloud Computing is a new computing paradigm that addresses a specific user requirement: it delivers computing services at minimal cost without installing them at local sites. It can be seen as an aggregation of many existing technologies, such as parallel and distributed computing, Service-Oriented Architecture, virtualization and networking.
In Cloud Computing, services are delivered over the Internet in an on-demand, elastic way, with charges paid when the resources are released. Cloud Computing has attracted researchers as an alternative to supercomputers for high performance computing, since it gives access to leased computing power and storage capacity.
In this topic, we revisit some problems and challenges of parallel and distributed systems in the context of Cloud Computing. These problems are subdivided into three sub-topics: Fault Tolerance, Programming Models and Performance Evaluation (see description of these sub-topics).
Fault tolerance is an important property of large scale computational systems, where geographically distributed nodes cooperate to execute tasks. Because of the number and diversity of components, nodes, networks, disks and applications frequently fail, restart, disappear or behave unexpectedly. As the number of Cloud system components increases, the probability of failure becomes higher than in a traditional parallel or distributed computing environment. Hence, support for the development of fault tolerant applications has been identified as one of the major technical challenges to address for the successful deployment of such computational systems. Since the failure of resources can be fatal to job execution, a fault tolerance service is essential to satisfy QoS requirements in Cloud Computing. Commonly used techniques for providing fault tolerance are job checkpointing, load balancing and replication. To study this problem, we proposed a dynamic colored graph for representing a Cloud infrastructure.
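The observation that failures become more likely as the number of components grows, and that replication reduces the risk of losing a job, can be illustrated with a minimal sketch. It assumes, purely for illustration, that components fail independently and with the same probability p. This assumption, the function names and the numeric values are hypothetical and are not taken from the work described here.

```python
def prob_any_failure(n: int, p: float) -> float:
    """Probability that at least one of n components fails,
    assuming independent failures with identical probability p."""
    return 1.0 - (1.0 - p) ** n


def prob_job_lost(p: float, replicas: int) -> float:
    """Probability that a replicated job is lost, i.e. that every
    one of its replicas fails (same independence assumption)."""
    return p ** replicas


if __name__ == "__main__":
    p = 0.001  # hypothetical per-component failure probability
    # More components: a failure somewhere becomes more likely.
    for n in (10, 1_000, 10_000):
        print(f"n={n:>6}: P(at least one failure) = {prob_any_failure(n, p):.4f}")
    # More replicas: losing every copy of a job becomes less likely.
    for k in (1, 2, 3):
        print(f"replicas={k}: P(job lost) = {prob_job_lost(p, k):.2e}")
```

Real Cloud failures are often correlated (for example, a shared rack or network link), so these figures only show the trend, not a realistic estimate.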
With the popularity of the Cloud Computing paradigm, it is a challenge to provide a proper programming model that supports convenient access to large scale data for performing computations while hiding all low-level details of the physical environment. A Cloud programming model defines what to program and how to program it on Cloud platforms. Cloud platforms provide only the basic local functions that an application program requires. The Cloud programming model depends on the Cloud platform we use: SaaS, PaaS or IaaS. A programming model generally contains three components: a programming language, a set of libraries and a runtime system, which together create a model of computation or an abstract machine. In this sub-topic, we use web services, VM and framework models to define and experiment with programming models (Hadoop, MapReduce, Spark). As examples of large scale applications, we essentially use data mining and clustering algorithms.
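To make the idea of a programming model that hides low-level details more concrete, here is a minimal single-process sketch of the MapReduce pattern applied to word counting. It is not the Hadoop or Spark API: the names `map_reduce`, `word_count_mapper` and `sum_reducer` are hypothetical, and a real framework would distribute the map, shuffle and reduce phases across many nodes.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def map_reduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterable[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run the map, shuffle and reduce phases sequentially in one process."""
    # Map phase: each record yields (key, value) pairs.
    # Shuffle phase: values are grouped by key as they are produced.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    # Reduce phase: each key's list of values is collapsed to one result.
    return {key: reducer(key, values) for key, values in grouped.items()}


def word_count_mapper(line: str) -> Iterable[Tuple[str, int]]:
    """Emit (word, 1) for every whitespace-separated word in the line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key: str, values: List[int]) -> int:
    """Sum all counts emitted for one word."""
    return sum(values)


if __name__ == "__main__":
    lines = ["cloud computing", "cloud programming model"]
    print(map_reduce(lines, word_count_mapper, sum_reducer))
```

The user writes only the mapper and the reducer; grouping, scheduling and, in a real framework, data placement and failure handling are left to the runtime system.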
Performance is the main criterion when adopting a Cloud solution: the Cloud must provide improved performance when a user moves to a Cloud Computing infrastructure. Performance is generally measured by the capabilities of the applications running on a Cloud infrastructure. Poor performance can be caused by a bad resource management policy, limited bandwidth, low CPU speed, insufficient memory, poor network connections, etc. Users often prefer to use services from more than one Cloud (multi-cloud infrastructure). In this situation, some applications are located on private Clouds while some other data or