Friday, April 27, 2012

Some useful terms - before going into particulars

Some useful terms before going into details of Distributed Systems

Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault tolerance, or accessibility. It is data replication if the same data is stored on multiple storage devices, and computation replication if the same computing task is executed many times. A computational task is typically replicated in space, i.e. executed on separate devices, or it can be replicated in time, if it is executed repeatedly on a single device. (Source: Wikipedia)

* Replicating a user's data on different servers minimizes inter-server traffic for reads, but increases replication overhead! This has a negative impact on:
  • query execution times
  • network traffic for updates
  • maintaining consistency across replicas
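
To make this trade-off concrete, here is a minimal sketch in Python. The `ReplicatedStore` class and its message counter are hypothetical names invented for this illustration, and the model assumes full replication (every server holds a copy of every item). Reads are answered by the local replica, while each write has to be sent to all replicas.

```python
class ReplicatedStore:
    """Toy model: every server holds a full copy of each user's data."""

    def __init__(self, num_servers):
        # One dictionary per server acts as that server's replica.
        self.replicas = [{} for _ in range(num_servers)]
        self.update_messages = 0  # messages sent to keep replicas in sync

    def write(self, key, value):
        # A write must reach every replica, so its cost grows with the
        # number of servers.
        for replica in self.replicas:
            replica[key] = value
            self.update_messages += 1

    def read(self, server_id, key):
        # A read is answered by the local replica: no inter-server traffic.
        return self.replicas[server_id].get(key)


store = ReplicatedStore(num_servers=3)
store.write("alice:status", "hello")
print(store.read(2, "alice:status"))  # hello
print(store.update_messages)          # 3
```

In this toy model one write costs three update messages with three servers; with more servers, the update traffic grows in proportion.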

 Stateless application: an application that does not remember the user's previous actions, e.g. the World Wide Web (HTTP itself is a stateless protocol). However, making the application remember the user's previous actions is pretty useful. That is why extra mechanisms, such as cookies in HTTP, were developed to remember things like configuration settings.
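
As a rough illustration of how cookies let a stateless server "remember" a setting, here is a small Python sketch using the standard library's `http.cookies` module. The `handle_request` function and the `theme` setting are hypothetical and exist only for this example. The handler keeps nothing between calls; everything it knows arrives in the request's Cookie header.

```python
from http.cookies import SimpleCookie


def handle_request(cookie_header):
    """Stateless handler: everything it knows arrives with the request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)  # parse the Cookie header sent by the browser

    # Read the remembered setting, falling back to a default for new visitors.
    theme = cookie["theme"].value if "theme" in cookie else "light"

    # Send the setting back so the browser stores it for the next request.
    reply = SimpleCookie()
    reply["theme"] = theme
    set_cookie_header = reply.output(header="Set-Cookie:")
    return theme, set_cookie_header


print(handle_request(""))            # first visit: default theme
print(handle_request("theme=dark"))  # later visit: browser sends the cookie
```

Because the state travels with the request, any server running this handler can answer any user.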

Scalability: A popular buzzword that refers to how well a hardware or software system can adapt to increased demands. For example, a scalable network system would be one that can start with just a few nodes but can easily expand to thousands of nodes. Scalability can be a very important feature because it means that you can invest in a system with confidence that you won't outgrow it.

Scaling: there are two approaches to making a system scalable.

1) Vertical Scaling = upgrading existing hardware, e.g. adding more CPU, memory or storage to a single machine. * Vertical scaling is generally too expensive!

2) Horizontal Scaling = a more cost-effective approach, e.g. adding more commodity servers to the system so that the workload can be shared among them. (Facebook, for example, scales its system across thousands of machines.) * Thanks to developments in cloud computing, it is now easier to reinforce your system: you no longer need your own hardware and can add more VMs from the cloud. * The application front end is stateless, which is a plus, because the application can be started on new servers on demand. The application back end, on the other hand, keeps state, which is a minus. (This is related to how the data is partitioned: if the data can be split into separate partitions, horizontal scaling is still possible.)
 (This disadvantage of keeping state in the back end deserves a more thorough explanation. Please send me an email if you have a good example.)
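
One possible way to picture why back-end state complicates horizontal scaling is simple hash partitioning, sketched below in Python. This is only an illustration under assumptions, not a description of any particular system: the `partition_for` function, the user names and the "hash modulo number of servers" rule are all made up for the example.

```python
import hashlib


def partition_for(user_id, num_servers):
    """Map a user to one of num_servers back-end partitions."""
    # A stable hash (unlike Python's built-in hash()) gives the same
    # answer in every process and on every run.
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers


users = ["alice", "bob", "carol", "dave"]
for n in (2, 3):
    # Show where each user's data would live with n back-end servers.
    placement = {u: partition_for(u, n) for u in users}
    print(n, "servers:", placement)
```

With this simple rule, changing the number of servers can change which server a user maps to, and that user's stored data then has to be moved. A stateless front end has nothing to move, which is why it can be started on new servers on demand.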

To be continued ...

What is a Bloom filter?




 Hashing? And Consistent Hashing?





 Attacks that we always hear about 

Sybil Attacks : 

The Sybil attack in computer security is an attack wherein a reputation system (like eBay's) is subverted by forging identities in peer-to-peer networks.
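
A toy Python sketch of the idea follows. The `reputation` function and the one-rating-per-identity rule are assumptions made for this illustration, not how any real reputation system works. It shows how an attacker who can create identities for free can outvote the honest users.

```python
def reputation(ratings):
    """Average score of a seller; ratings maps identity -> score (1 to 5)."""
    return sum(ratings.values()) / len(ratings)


# Three honest buyers give a bad seller low scores.
honest = {"buyer1": 1, "buyer2": 2, "buyer3": 1}
print(round(reputation(honest), 2))

# The seller forges 20 identities, each of which leaves a top score.
forged = {f"sybil{i}": 5 for i in range(20)}
print(round(reputation({**honest, **forged}), 2))
```

The honest ratings are still counted, but they are drowned out by the forged ones.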


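What is a dirty bit?

A dirty bit is a flag attached to a block of memory, such as a cache line or a virtual-memory page. It is set when the CPU modifies the block but has not yet written the change back to the underlying storage. The system checks this flag to decide whether the block must be written back before it is replaced.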
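
Here is a minimal Python sketch of a write-back cache that tracks dirty blocks. The `WriteBackCache` class is hypothetical and greatly simplified (real hardware keeps the bit per cache line or page), but it shows when the flag is set and when it is cleared.

```python
class WriteBackCache:
    """Toy write-back cache: one dirty flag per cached block."""

    def __init__(self, storage):
        self.storage = storage  # backing store: block id -> value
        self.blocks = {}        # cached copies: block id -> value
        self.dirty = set()      # ids of blocks whose dirty bit is set

    def read(self, block_id):
        # Load the block from storage on a miss, then serve it from cache.
        if block_id not in self.blocks:
            self.blocks[block_id] = self.storage[block_id]
        return self.blocks[block_id]

    def write(self, block_id, value):
        # Modify only the cached copy and set the dirty bit.
        self.blocks[block_id] = value
        self.dirty.add(block_id)

    def evict(self, block_id):
        # A dirty block must be written back before it is dropped;
        # a clean block can simply be discarded.
        value = self.blocks.pop(block_id)
        if block_id in self.dirty:
            self.storage[block_id] = value
            self.dirty.discard(block_id)


storage = {0: "old"}
cache = WriteBackCache(storage)
cache.write(0, "new")
print(storage[0], 0 in cache.dirty)  # old True
cache.evict(0)
print(storage[0], 0 in cache.dirty)  # new False
```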


References: Josep M. Pujol, Vijay Erramilli, Georgos Siganos, Xiaoyuan Yang, Nikos Laoutaris, Parminder Chhabra, Pablo Rodriguez (Telefonica Research). The Little Engine(s) That Could: Scaling Online Social Networks.
