Saturday
Sep 24, 2011

Cassandra - Split brain schemas

It's possible to end up with a Cassandra cluster where two different schemas are running on the nodes in the ring. I don't know how, but I have managed it, so it's definitely possible :) - I have a knack for breaking things in strange ways.

In case anyone comes a cropper on the same issue, here are some useful things I discovered whilst trying to resolve it. Firstly, these two facts:

* Cassandra (not you) decides which version of the schema can overwrite the other nodes' schemas; I think this is based on a timestamp (or similar). Even if Cassandra is not automatically resolving the schema issue, there is still only one schema that can spread, and AFAIK you can't select which one it is.

* Just because more nodes have a particular schema version does not necessarily mean that it is the most up-to-date one.

With that knowledge, how do you resolve the dual schema problem?

The first thing I would suggest doing is taking a node that has the same schema version as another node, wiping its data and seeing which schema version Cassandra loads back onto it. Okay, this isn't really my idea - someone on the Cassandra IRC channel on freenode suggested it - but it was very helpful!

After the node is reset and the schema is loaded, take a note of which schema version got loaded onto it. It may be the schema that was already on it, or it may be the other schema; either way, you now know which schema is the up-to-date one.
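
For reference, this is roughly how I checked and reset things. The paths and service commands below are from my setup (a packaged install with data under /var/lib/cassandra), so treat them as an example rather than gospel. From cassandra-cli you can list each schema version and the nodes holding it with:

describe cluster;

Then, on the node you want to reset, the usual advice is that you only need to wipe the schema and migration sstables from the system keyspace rather than the whole data directory:

/etc/init.d/cassandra stop
rm /var/lib/cassandra/data/system/Schema*
rm /var/lib/cassandra/data/system/Migrations*
/etc/init.d/cassandra start

When the node comes back up it should pull a schema from the rest of the ring.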

Saturday
Sep 24, 2011

Postgresql - Finding unused indexes

Database indexes are very useful, but the downside is that updating or adding rows to tables becomes more expensive, as all the indexes also need updating.

If you are using postgresql then you probably already have access to information on index usage, as this is usually collected by default - if not, the postgresql docs on statistics collection explain how to turn it on.
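
A quick sanity check that the statistics collector is actually recording this information (it is on by default) is to look at the track_counts setting - if it comes back off, the idx_scan numbers in the queries below won't mean much:

SHOW track_counts;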

This query (tested on PG9) gives some information on index usage and index size:

SELECT t.tablename,
  indexname,
  c.reltuples AS num_rows,
  pg_size_pretty(pg_relation_size(t.tablename::text)) AS table_size,
  pg_size_pretty(pg_relation_size(indexrelname::text)) AS index_size,
  CASE WHEN x.is_unique = 1 THEN 'Y' ELSE 'N' END AS UNIQUE,
  idx_scan AS number_of_scans,
  idx_tup_read AS tuples_read,
  idx_tup_fetch AS tuples_fetched
FROM pg_tables t
LEFT OUTER JOIN pg_class c
  ON t.tablename = c.relname
LEFT OUTER JOIN (
    SELECT indrelid,
      max(CAST(indisunique AS integer)) AS is_unique
    FROM pg_index
    GROUP BY indrelid) x
  ON c.oid = x.indrelid
LEFT OUTER JOIN (
    SELECT c.relname AS ctablename,
      ipg.relname AS indexname,
      x.indnatts AS number_of_columns,
      idx_scan,
      idx_tup_read,
      idx_tup_fetch,
      indexrelname
    FROM pg_index x
    JOIN pg_class c ON c.oid = x.indrelid
    JOIN pg_class ipg ON ipg.oid = x.indexrelid
    JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid) AS foo
  ON t.tablename = foo.ctablename
WHERE t.schemaname = 'public'
ORDER BY number_of_scans DESC;

If you assume that all indexes that are needed have been used at least once, this next query tells you how big all the unused indexes are:

SELECT pg_size_pretty(sum(pg_relation_size(indexrelname::text))::bigint) AS index_size
FROM pg_tables t
LEFT OUTER JOIN pg_class c
  ON t.tablename = c.relname
LEFT OUTER JOIN (
    SELECT indrelid,
      max(CAST(indisunique AS integer)) AS is_unique
    FROM pg_index
    GROUP BY indrelid) x
  ON c.oid = x.indrelid
LEFT OUTER JOIN (
    SELECT c.relname AS ctablename,
      ipg.relname AS indexname,
      x.indnatts AS number_of_columns,
      idx_scan,
      idx_tup_read,
      idx_tup_fetch,
      indexrelname
    FROM pg_index x
    JOIN pg_class c ON c.oid = x.indrelid
    JOIN pg_class ipg ON ipg.oid = x.indexrelid
    JOIN pg_stat_all_indexes psai ON x.indexrelid = psai.indexrelid) AS foo
  ON t.tablename = foo.ctablename
WHERE t.schemaname = 'public'
  AND idx_scan = 0;

You can invert that last query to get the total size of all the in-use indexes by changing the idx_scan = 0 to idx_scan > 0. If you are heavily reliant on the database fitting into physical RAM (as I usually am) then you may want to keep an eye on the index size, along with the size of your tables.
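
In other words, the last two lines of the query become:

WHERE t.schemaname = 'public'
  AND idx_scan > 0;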

If you are looking to reduce the size of the indexes, you could look at some of your biggest indexes and think about the queries that actually run against the table - do you really need to index the *whole* table for your queries? If not, you could look at changing the index to a partial index, which is basically an index that only covers the part of a table matching a where clause.
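
As an illustration (the table and column names here are made up, not from any real schema), a partial index like this only indexes the rows your queries actually filter on:

CREATE INDEX orders_unshipped_idx ON orders (customer_id) WHERE shipped = false;

Queries that include the same shipped = false condition can use this index, and it only takes up space for the unshipped rows rather than the whole table.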

There are a few other things that can help get the index size down: ensure the autovacuum settings are aggressive enough, dump and restore your database periodically, delete rows from the underlying table, etc. :)
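
For what it's worth, these are the sort of postgresql.conf settings I mean by aggressive autovacuum - the values are purely illustrative, not recommendations for any particular workload:

autovacuum = on
autovacuum_naptime = 30s
autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.05

Lower scale factors mean tables get vacuumed after a smaller fraction of their rows change, which keeps dead tuples (and therefore table and index bloat) down at the cost of more background I/O.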

Saturday
Aug 8, 2009

Using XCode with SVN - some Gotchas

I mainly do Java development using Eclipse with SVN, but recently I've been playing around with a bit of iPhone development using XCode.

I've included below some gotchas about using XCode with SVN, in the hope that it might save someone else from making the same mistakes as me...

  • The left hand navigator pane in XCode labelled files and groups does not show a filesystem view of the folder containing the project files.
  • XCode projects need files adding in manually - the project structure does not have to match the filesystem directory structure at all.
  • You will probably want to exclude the build directory from version control - XCode will not do this for you (see the example after this list).
  • Be aware that it's easy to accidentally create links to files in the project rather than putting the files into the project folder itself. Make sure when you drop a file into the project that you are copying it into the actual project and not just linking to an external file.
  • When committing files to version control you will probably also want to commit the project file itself; you may miss this file if you aren't looking for it, but it should show up in the SCM section. This file contains the list of all the files that are actually in the project, so if you don't commit it then the files will all be under version control, but when you check out the project on a fresh machine they won't be included in the project, probably leaving the project with breakages.
  • New files that are placed in the project will not be added to version control, even if the parent folder is already under version control and the entire project was originally checked out from an SCM such as SVN.
  • When using XCode to check out from a repository, you probably then want to tell XCode that the project is from that repository.
  • There is no way to get a graphical project level diff within XCode, such as the synchronization view in Eclipse which helps you merge and roll back changes prior to doing a commit. File level diffs can be done, and to get an idea of which files in the project have changed you can right-click the groups and files header in the left hand navigation panel and select SCM; this puts a letter next to each file depending on its state.
  • You may want to exclude your user files inside foo.xcodeproj/blah.pbxuser as these contain user specific settings (again, see the example below).
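
As a rough sketch of how to exclude the build directory and user files mentioned above (this assumes a project directory called foo and that you are happy to drop to the command line - it is not something XCode does for you):

svn propset svn:ignore "build" .
svn propset svn:ignore "*.pbxuser" foo.xcodeproj
svn commit -m "Ignore build output and user-specific XCode settings"

After that svn status stops nagging about build products, and your personal window layout doesn't end up in anyone else's checkout.
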
Saturday
Dec 8, 2007

Resizing a Fedora 8 VMware Fusion image

This is how I managed to resize a VMware image for Fedora 8.

I started with a Fedora 8 VMware image from thoughtpolice and a copy of VMware Fusion.

Step 1 Resize the virtual disk

Under OS X run:
"/Applications/VMware Fusion.app/Contents/MacOS/diskTool" -X 40Gb fedora-8-i386.vmdk

(where fedora-8-i386.vmdk is the VMware disk image file - the path needs quoting because it contains a space)

Step 2 Create an Ext3 partition in the free space.

Download the latest GParted ISO. Connect it to VMware Fusion (Virtual Machine | CD/DVD | Choose Disk Image).

Restart the virtual machine, but during the initial boot go into the virtual BIOS settings and change the boot order so that the CD boots before the hard disk.

Restart the virtual machine to boot into GParted. (You cannot resize the existing partition as it shows up as unknown - but do not cry at this point!)
Select the free space and turn it into an ext3 partition.

Shut down and disconnect the virtual GParted CD image.

Step 3 Add the new partition into the logical volume.

I booted up into Fedora to change the logical volume size - this may not be the best way to do it, but it worked for me. Then as root I executed the following.

pvcreate /dev/sda3

vgextend VolGroup00 /dev/sda3
vgdisplay -v VolGroup00

Somewhere in the output it will display a number for Free PE on /dev/sda3.

Now enter:

lvextend -l xxxx /dev/VolGroup00/LogVol00

where xxxx is the Free PE number. (lvextend -l +xxxx would instead grow the volume by that many extents rather than setting its total size in extents - that form is probably safer if your existing volume is already larger than the free space.)

Finally, the last step is a bit scary: it resizes the filesystem to make use of the larger volume whilst the filesystem is still mounted. But it did seem to work for me!

resize2fs /dev/VolGroup00/LogVol00

After a while the result should show something like:

resize2fs 1.40.2 (12-Jul-2007)
Filesystem at /dev/VolGroup00/LogVol00 is mounted on /; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 2
Performing an on-line resize of /dev/VolGroup00/LogVol00 to 8380416 (4k) blocks.
The filesystem on /dev/VolGroup00/LogVol00 is now 8380416 blocks long.

And bingo, my disk size has gone up from 7 gig to 30 gig:

[root@localhost dev]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
                                  31G  3.1G   27G  11% /
/dev/sda1                        190M   13M  169M   7% /boot
tmpfs                            125M   12K  125M   1% /dev/shm


Wednesday
Jun 6, 2007

Why Jotspot is not a good wiki

Let's start off with the basic principles of a wiki. The original C2 wiki covers the basics of why wikis work pretty well. Of particular note is the following:

"Wiki is not WysiWyg. It's an intelligence test of sorts to be able to edit a wiki page. It's not rocket science, but it doesn't appeal to the VideoAddicts. If it doesn't appeal, they don't participate, which leaves those of us who read and write to get on with rational discourse."

So when I tell you that Jotspot is a so-called WYSIWYG wiki, you would be right to be suspicious! After all, we all know the kind of crap that WYSIWYG generators generally produce (see MS Frontpage!). But I was willing to give Jotspot a go. Well, here are some of the improvements Jotspot has made to the concept of the wiki:

  • Quick and simple to create a new page.
    • Jotspot improves this experience by assuming you perhaps want to create a spreadsheet every time you try to create a new page.
  • Wiki syntax is not WYSIWYG.
    • Jotspot improves on this experience by defaulting you to WYSIWYG rather than wiki syntax, which enables lazy users to mess up your wiki markup very quickly.
    • Better still, using the default WYSIWYG often actually breaks the pages so they can no longer be edited in normal markup mode. Ever. Genius.
  • Original wikis used a simple textarea for input.
    • Jotspot improves on this with Web 2.0 goodies such as pages that keep refreshing for no reason and sucking up all your CPU.
    • Jotspot also uses a javascript pulldown menu instead of a button, just to edit in markup mode! This is both annoying and broken in some browsers.
    • The usual form page doesn't appear in the history, so there's no navigating back and forth. Instead, when you click back you get an annoying javascript popup telling you maybe you didn't want to move off the page because you will lose everything. Which indeed you will.

Finally, I'd just like to share some of my favourite random Jotspot errors:
Additional information:
com.s3.script.lib.JotLibException: function filterBadHTML: parameter html: no value specified.

After which Jotspot becomes unusable.