Partitioning is more a generic term for dividing data across tables or databases. Sharding is one specific type of partitioning, part of what is called horizontal partitioning.
Here you replicate the schema across (typically) multiple instances or servers, using some kind of logic or identifier to know which instance or server to look for the data. An identifier of this kind is often called a “Shard Key”.
A common, key-less logic is to use the alphabet to divide the data. A-D is instance 1, E-G is instance 2 etc. Customer data is well suited for this, but will be somewhat misrepresented in size across instances if the partitioning does not take in to account that some letters are more common than others.
Another common technique is to use a key-synchronization system or logic that ensures unique keys across the instances.
A well known example you can study is how Instagram solved their partitioning in the early days (see link below). They started out partitioned on very few servers, using Postgres to divide the data from the get-go. I believe it was several thousand logical shards on those few physical shards. Read their awesome writeup from 2012 here: Instagram Engineering – Sharding & IDs
See here as well: http://www.quora.com/Whats-the-difference-between-sharding-and-partition