with 'open source' tag

what is fair?

Distribution Service

Linux has been written entirely by volunteers who have been working on their own time, and I don't think that should change. I also don't think it's fair that someone take what has been written for free by people and try and sell it to turn a buck (i.e. make a living doing so). How fair is that to those of us who contribute our time freely?

i am not sure when it shifted, if it was gradual or sudden, but i do not agree with what i wrote back then (1992!) about the fairness of someone building a business on what someone else has given freely. that has always been a central tension in what became known as the open source community, and you even see it coming up in current discussions about the training data used by generative “AI” systems.

(also, another quote: “I cannot see Linux being a full-time thing for anyone at this point, really.” oops!)

cranking along

a few weeks ago, i got it in my head that the ideal time to switch to using scat, the web-based point-of-sale system that i have allegedly been working on for almost 18 months, would be after the new year. this is despite the fact that it was barely even a rough prototype of some ideas. but i’ve been cranking along on it over the last couple of weeks, and it looks like i might just have something that we can use by my arbitrary deadline.

it’s still just barely a rough prototype of ideas, but there should be enough there on the surface for us to be able to use it, and needing to fill in the gaps because we are actually using it will probably provide plenty of motivation to make it better.

even in the very rough state it is in, it should alleviate some of the pains of our current system. and save us the $30/month we were paying for not-very-helpful support and slow-to-arrive upgrades (trading that for my time to support the new system, of course).

banker’s round for mysql

for some reason, nobody has ever exposed the different rounding methods via mysql’s built-in ROUND() function, so if you want something different, you need to add it via a stored function. the function below is based on the T-SQL version here.

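-- round half to even; in the mysql command-line client, change the
-- DELIMITER first so the semicolons in the body do not end the statement early
CREATE FUNCTION ROUND_TO_EVEN(val DECIMAL(32,16), places INT)
RETURNS DECIMAL(32,16)
DETERMINISTIC
BEGIN
  -- on an exact tie with an even last kept digit, truncate; otherwise ROUND() is right
  RETURN IF(ABS(val - TRUNCATE(val, places)) * POWER(10, places + 1) = 5
            AND NOT CONVERT(TRUNCATE(ABS(val) * POWER(10, places), 0),
                            UNSIGNED) % 2 = 1,
            TRUNCATE(val, places), ROUND(val, places));
END;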

use at your own risk. there may be edge conditions where this fails. but this matches up with the python and postgres based system i was crunching data from, except in cases where that system gets it wrong for some reason.

one thing you might notice is that it does not use any string-handling functions like the other “correct” solution floating around out there.
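for cross-checking results outside of mysql, here is a minimal python sketch of the same rounding rule. it is not the stored function above, just the standard decimal module doing round-half-to-even; the name round_to_even and the choice to pass values as strings (so no binary floating point sneaks in) are assumptions made for the illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def round_to_even(val, places):
    """Round val (a string or Decimal) to `places` decimals, ties to even."""
    # Decimal(1).scaleb(-2) is Decimal('0.01'): the quantum we round to
    quantum = Decimal(1).scaleb(-places)
    return Decimal(val).quantize(quantum, rounding=ROUND_HALF_EVEN)


def round_half_away(val, places):
    """Ties away from zero, which is what ROUND() does for exact DECIMAL values."""
    quantum = Decimal(1).scaleb(-places)
    return Decimal(val).quantize(quantum, rounding=ROUND_HALF_UP)


print(round_to_even("2.345", 2))    # 2.34 (tie, last kept digit 4 is even)
print(round_to_even("2.355", 2))    # 2.36 (tie, last kept digit 5 is odd)
print(round_half_away("2.345", 2))  # 2.35
```

only exact ties differ between the two; anything that is not exactly halfway rounds the same under both rules.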

building an order

Scat screenshot

it feels like i’ve been thinking about this point-of-sale thing long enough that when i find time to sit down and write some code, the pieces actually fall together pretty quickly. today i was able to rough together an order-building interface with a few bells and whistles (literally: it has audible cues).

still a long way to go, but this should be a useful little toy to let us take advantage of a spare computer and barcode scanner to more easily price items from incoming shipments and get them out on the shelves.

one of the things that this screenshot shows is what happens when a scan or search matches multiple different items — it adds a (crudely presented) list of the possible matches, and clicking on any one of them adds that item to the order.

three kinds of people

progress on scat continues to be slow, because i have not found a lot of time to work on it. but every week i have to deal with processing our weekly intake of products with our current point-of-sale system, i kick myself a little more and get motivated to spend a little more time on it.

the big addition today was a table for people, which is pretty straightforward. our current system divides people up into three types (and stores them in the same table as products, thanks to a normalization scheme i have not carried over into scat). for now i don’t make any such distinction: sometimes customers become employees, vendors can be customers, and we have few enough of all three that keeping them distinct doesn’t seem worth the extra complexity.

scattered progress

scat is the name of the web-based point-of-sale system that i’ve been working on. you might have guessed this if you had been paying attention to the tags on my earlier posts. you can also find the source code for scat on github. there is not much to see, as i am still tinkering and throwing code together to test ideas out.

the progress so far is that i can load all of the item data over from our checkout data, and search those items. one of the most painful things for us right now is receiving orders, so i have cobbled together the start of that functionality to use. we will be able to receive the order using this as we unpack the order, and then go back to checkout and receive the order there all at once through its stock room interface.

(the reason that receiving orders through checkout is so painful for us is that using the stock room interface cripples the performance of checkout. and since we are just using checkout on a single computer, it makes it hard to receive orders while we also want to serve customers.)

this is starting to get fun.

stuff in, stuff out

i know that i said that inventory is next, but i’m not sure that it really is, or at least not in terms of having an inventory that we add items to and take items out of. maybe what we really have is a collection of transactions that in their aggregate can be used to describe the inventory.

as i see it, there are three types of transactions:

  • vendor transactions: we put together a purchase order, we receive items (which may be more or less than what is on the purchase order and may not happen all at once), and we return items.
  • customer transactions: customers order items, we “deliver” items, and customers return items.
  • internal transactions: items are damaged, defective or stolen, and we take items for our own use.

so we’ll need a basic table for tracking these transactions (which i will abbreviate to txn because i am lazy):

CREATE TABLE `txn` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `number` int(10) unsigned NOT NULL,
  `created` datetime NOT NULL,
  `type` enum('internal','vendor','customer') NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `type` (`type`,`number`)
);

i am not thrilled with the number field, but we need some sort of user-visible number for printing on invoices, receipts, etc. consider this a placeholder for a better idea.

and for each transaction, we will have lines of items involved:

CREATE TABLE `txn_line` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `txn` int(10) unsigned NOT NULL,
  `line` int(10) unsigned NOT NULL,
  `item` int(10) unsigned DEFAULT NULL,
  `ordered` int(10) unsigned NOT NULL,
  `allocated` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  KEY `txn` (`txn`,`line`),
  KEY `item` (`item`)
);

these tables are both very incomplete — no prices are being tracked here yet, among other things. but this is enough for me to start playing with loading in data and building interfaces to it.
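to make the "inventory is just the aggregate of transactions" idea concrete, here is a rough python sketch using an in-memory sqlite database as a stand-in for mysql. several things here are assumptions, not part of the schema above: that allocated on a vendor line means items received, that allocated on customer and internal lines means items that left, that returns are ignored entirely, and all of the sample rows.

```python
import sqlite3


def on_hand(conn):
    """Return {item: quantity} derived purely from transaction lines."""
    rows = conn.execute("""
        SELECT txn_line.item,
               SUM(CASE txn.type WHEN 'vendor' THEN txn_line.allocated
                                 ELSE -txn_line.allocated END)
          FROM txn JOIN txn_line ON txn_line.txn = txn.id
         GROUP BY txn_line.item
    """)
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- simplified copies of the two tables (sqlite types, no indexes)
    CREATE TABLE txn (id INTEGER PRIMARY KEY, number INTEGER NOT NULL,
                      created TEXT NOT NULL, type TEXT NOT NULL);
    CREATE TABLE txn_line (id INTEGER PRIMARY KEY, txn INTEGER NOT NULL,
                           line INTEGER NOT NULL, item INTEGER,
                           ordered INTEGER NOT NULL, allocated INTEGER NOT NULL);
    -- made-up sample data: one purchase order, one sale, one damaged item
    INSERT INTO txn VALUES (1, 1, '2000-01-01', 'vendor'),
                           (2, 1, '2000-01-02', 'customer'),
                           (3, 1, '2000-01-03', 'internal');
    INSERT INTO txn_line VALUES (1, 1, 1, 42, 12, 10),  -- ordered 12, received 10
                                (2, 2, 1, 42, 3, 3),    -- sold 3
                                (3, 3, 1, 42, 1, 1);    -- 1 damaged
""")

print(on_hand(conn))  # {42: 6}
```

nothing is ever stored as "the" quantity on hand; it falls out of the sum, which is the appeal (and, on a large table, eventually the cost).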

more pieces of the puzzle

back to noodling around with the item table. i think i am going to try and be a bit less deliberative with all of this, since i clearly don’t have a lot of spare time to spend on this and need to build some momentum.

i won’t get into tracking inventory yet, but a basic quality of an item i want to track is a minimum quantity to have on hand. i guess in an ideal system, these minimum quantities would be dynamic and driven by actual sales data, but for now we’ll be hand-tuning this number for items.

an item gets called by its name, so we’ll need a field for that.

so here is our final rough draft of the item table:

CREATE TABLE `item` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `code` varchar(255) NOT NULL,
  `name` varchar(255) NOT NULL,
  `brand` int(10) unsigned DEFAULT NULL,
  `retail_price` decimal(9,2) NOT NULL,
  `discount_type` enum('percentage','relative','fixed') DEFAULT NULL,
  `discount` decimal(9,2) DEFAULT NULL,
  `minimum_quantity` int(10) unsigned NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `code` (`code`)
);

and you’ll notice that i snuck in a brand column there, which will be our link over to another table, very basic for now (and maybe for good):

CREATE TABLE `brand` (
  `id` int(10) unsigned NOT NULL,
  `name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
);

we have a nice barcode scanner that we’d like to keep using, so we’ll need to have barcodes we can relate to items. but some items have more than one barcode (common with books that have an ISBN and a UPC), and sometimes things come in packages of one quantity of items that can be decomposed into individual items with different barcodes. so barcodes will live in their own table, and each code will identify an item and a quantity:

CREATE TABLE `barcode` (
  `code` varchar(255) NOT NULL,
  `item` int(10) unsigned NOT NULL,
  `quantity` int(10) unsigned NOT NULL DEFAULT '1',
  PRIMARY KEY (`code`),
  KEY `item` (`item`)
);
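a small python sketch of how a scan could resolve through that table, again with in-memory sqlite standing in for mysql. the scan helper, the item id, and the barcode strings are all made up for the example; real codes would be UPCs or ISBNs.

```python
import sqlite3


def scan(conn, code):
    """Resolve a scanned barcode to (item id, quantity), or None if unknown."""
    return conn.execute(
        "SELECT item, quantity FROM barcode WHERE code = ?", (code,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE barcode (code TEXT PRIMARY KEY, item INTEGER NOT NULL,
                          quantity INTEGER NOT NULL DEFAULT 1);
    -- one item (id 7) reachable through three different barcodes
    INSERT INTO barcode (code, item, quantity) VALUES
        ('SINGLE-UPC', 7, 1),
        ('SINGLE-ISBN', 7, 1),
        ('BOX-OF-12', 7, 12);
""")

print(scan(conn, "SINGLE-ISBN"))  # (7, 1)
print(scan(conn, "BOX-OF-12"))    # (7, 12)
print(scan(conn, "UNKNOWN"))      # None
```

scanning the box and scanning twelve singles end up meaning the same thing, which is the point of keeping the quantity on the barcode rather than on the item.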

we’ll want some more categorization later, but it’s not critical yet. inventory is next, but i am going to have to sleep on it.

no identity crisis

i slipped something into the item table of my nascent point-of-sale system last time that i didn’t explain at all. it’s just a little thing, a column called id that is an auto-incrementing integer. we need a way to uniquely identify items, and that’s the fallback method for this broken-down php and mysql coder.*

on the other hand, dealing with ringing up customers and putting together orders from distributors, my experience has been that it is good to have a short-hand identifier for products that is not totally opaque like a bare number. you can see that the developers of php point of sale came to the same conclusion by their inclusion of an item_number field (which is not a number, but we won’t hold that against them). the point-of-sale system we are using currently has a unique identifier for items that they call the code, and the underlying numeric identifiers in the database are never actually exposed in the interface.

the codes we use to identify products are borrowed almost entirely from the way that our primary distributor identifies products. each code has a two-letter prefix that identifies the brand of the product, and then the rest of the identifier is structured differently depending on the brand. another of our distributors uses a fairly similar system with a three-letter prefix separated by a dash. depending on the brand and product, this means that looking up similar products can be straightforward if you just remember a part of the code. for example, i have it baked into my brain that all art alternatives studio canvases have a code starting with 'AA55', so doing searches or reports on just those items means i can just type in that prefix instead of having to navigate a more complicated category system. not all of the brands have codes that are structured that conveniently. products from 3M, for example, have a prefix of 'MT' but the rest of the code is based on a portion of the UPC, and a line like all of the command hooks & clips doesn’t sit within the same numeric range, so there’s no one prefix that will come up with just those.
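the prefix lookup described above comes down to a LIKE query. here is a hedged python sketch against in-memory sqlite; the find_by_prefix helper and every code other than the 'AA55' prefix are invented for the example, and the item table is trimmed to just the code column.

```python
import sqlite3


def find_by_prefix(conn, prefix):
    """Return all item codes starting with prefix, sorted."""
    # escape LIKE wildcards so a literal % or _ in a prefix is matched as-is
    escaped = (prefix.replace("\\", "\\\\")
                     .replace("%", "\\%")
                     .replace("_", "\\_"))
    rows = conn.execute(
        "SELECT code FROM item WHERE code LIKE ? ESCAPE '\\' ORDER BY code",
        (escaped + "%",),
    )
    return [code for (code,) in rows]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, code TEXT UNIQUE);
    -- made-up codes in the style described: brand prefix, then the rest
    INSERT INTO item (code) VALUES
        ('AA5501'), ('AA5502'), ('AA9000'), ('MT17001'), ('CH44201');
""")

print(find_by_prefix(conn, "AA55"))  # ['AA5501', 'AA5502']
```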

another interesting thing to consider is that an identification scheme based on the brand isn’t stable. not too long ago, chartpak acquired the higgins brand from sanford, which meant in the language of the codes that our distributor uses (and we use), the prefix on the higgins items changed from 'SA' to 'CH'. how we track those sorts of changes is something we’ll have to consider later, but it does demonstrate that relying on this code as our primary identifier would be unwise.

but i think the real bottom line is that these identifiers are just unique opaque identifiers for the users of the point-of-sale system, so the system doesn’t need to impose any structure on them. in fact, i’m not sure if i can come up with a reason why they shouldn’t be optional, so i’ve left it open to an item not having a code.

so here is the updated table with our newly-minted code field:

CREATE TABLE item (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  code VARCHAR(255),
  retail_price DECIMAL(9,2) NOT NULL,
  discount_type ENUM('percentage','relative','fixed'),
  discount DECIMAL(9,2),
  PRIMARY KEY(id),
  UNIQUE (code)
);

* so is an auto-incrementing integer really the best primary key to use? it may seem a little more grown-up to use something like a uuid, but while these identifiers may be intended to be hidden, as someone who will almost certainly be looking behind the curtain to run queries against these tables manually, relatively small integers are a whole lot easier to deal with than big hexadecimal ones.

the prices need to be right

i wasn’t entirely truthful when i said i wasn’t sure where to start when writing a point-of-sale system. clearly the place to start is with a model of the data you are going to be handling, and because we are a retail store dealing mostly with items out of inventory, describing an item is probably the place to start with that.

php point of sale has a pretty simple item table:

CREATE TABLE `phppos_items` (
  `name` varchar(255) NOT NULL,
  `category` varchar(255) NOT NULL,
  `supplier_id` int(11) DEFAULT NULL,
  `item_number` varchar(255) DEFAULT NULL,
  `description` varchar(255) NOT NULL,
  `cost_price` double(15,2) NOT NULL,
  `unit_price` double(15,2) NOT NULL,
  `quantity` int(10) NOT NULL DEFAULT '0',
  `reorder_level` int(10) NOT NULL DEFAULT '0',
  `item_id` int(10) NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`item_id`),
  UNIQUE KEY `item_number` (`item_number`),
  KEY `phppos_items_ibfk_1` (`supplier_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

it’s not a bad start, but it is very limited. the net cost of an item (cost_price in the table) is not a constant. prices change, different suppliers may charge different prices for the same item, and suppliers often have special deals based on quantity or time. and yes, suppliers with an 's', because we can get many items through more than one supplier.

even just two prices aren’t really enough: most of the items have net prices they are available to us at (depending on supplier and specials), the net price we actually paid for items in inventory, a retail price (also known as msrp), our every-day price (often a fixed percentage discount from msrp), limited-time sale prices, and even discounts based on a quantity of related items being purchased (buy six cans of spray paint, get them all at 25% off instead of 20% off). there’s also the price that someone actually paid for an item when they purchase it, which is usually derived from one of those others but could also be something that we further change or discount for a particular transaction. clearly, two fields in one table don’t quite capture this complexity.

if i were to really boil down the pricing in a primary item table, i think the only values that would be necessary are the retail price and our every-day price (expressed as a fixed price, relative price, or discount). even that retail price could arguably draw from the data that our suppliers provide, but we don’t always roll out changes to the suggested retail price at the same time our suppliers may update the pricing, and suppliers may not always agree on what the suggested retail price may be. so here’s my item table so far:

CREATE TABLE item (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  retail_price DECIMAL(9,2) NOT NULL,
  discount_type ENUM('percentage','relative','fixed'),
  discount DECIMAL(9,2),
  PRIMARY KEY(id)
);

obviously i haven’t yet captured all of the complexity that i’ve outlined above, but i’ll get there eventually.
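here is one way the every-day price could be derived from those three columns, as a python sketch. the meaning of each discount_type is an assumption read off the description above ('percentage' is percent off retail, 'relative' is an amount off retail, 'fixed' is the price itself), and the everyday_price name and the rounding to whole cents with ties to even are likewise choices made for the example.

```python
from decimal import Decimal, ROUND_HALF_EVEN

CENT = Decimal("0.01")


def everyday_price(retail_price, discount_type=None, discount=None):
    """Derive the every-day price from retail_price, discount_type, discount."""
    retail_price = Decimal(retail_price)
    if discount_type is None:
        price = retail_price                       # no discount set
    elif discount_type == "percentage":
        price = retail_price * (Decimal(100) - Decimal(discount)) / Decimal(100)
    elif discount_type == "relative":
        price = retail_price - Decimal(discount)   # amount off retail
    elif discount_type == "fixed":
        price = Decimal(discount)                  # the price itself
    else:
        raise ValueError("unknown discount_type: %r" % (discount_type,))
    # assumption: prices are kept to whole cents, ties rounded to even
    return price.quantize(CENT, rounding=ROUND_HALF_EVEN)


print(everyday_price("10.00", "percentage", "20"))  # 8.00
print(everyday_price("10.00", "relative", "1.50"))  # 8.50
print(everyday_price("10.00", "fixed", "7.99"))     # 7.99
print(everyday_price("10.00"))                      # 10.00
```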

piece of what?

i spend a lot of my day now dealing with a point-of-sale system that bothers me for the same reason that most software bothers me: it is broken and i can’t fix it. in this case, i can’t fix it because it is a closed-source application. one redeeming feature of the software is that it uses postgres as its back-end database, so it is relatively straightforward to get at the raw data and i’m not entirely hobbled by the slow, incomplete interfaces that the software itself offers. (instead i’m just hobbled by its baroque and undocumented schema and an inability to change or add to the data.)

so i have been poking around at the scant open source point-of-sale solutions, and they all generally look terrible, are complicated in directions that i don’t need complication, or are written in stupid languages like java.

php point of sale is way too simplistic, but it has helped me think about how i would (and likely will) build a point-of-sale system. unfortunately, i still haven’t figured out where to start.

so my hope is that if i start writing about it, i will find an entry point and i can eventually start building.

what is 10% of php worth?

i am listed as one of the ten members of the php group. most of the php source code says it is copyright “the php group” (except for the zend engine stuff). the much-debated contributor license agreement for PDO2 involves the php group.

could i assign whatever rights (and responsibilities) my membership in the php group represents to someone else? how much should i try to get for it? i mean, if mysql was worth $1 billion....

i am still disappointed that a way of evolving the membership of the php group was never established.

being known for being you

mike kruckenberg shared his observations from watching mysql source code commits, and jay pipes commented about this commit from antony curtis which had him excited. now that’s how open source is supposed to work, at least in part.

i replied to a later version of that commit to our internal developer list (and antony), pointing out that with just a little effort the comment would be more useful to people outside of the development team. “plugin server variables” doesn’t really do it justice, and “WL 2936” is useful to people who can access our internal task tracking tool, but does no good to people like mike.

the other reason it is good to engage the community like this is because it is very healthy for your own future. being able to point to the work i had done on open source and the networking that came from that have both been key factors in getting jobs for me. i’m sure it will be useful next time i am looking, too.

producing open source software by karl fogel (hardcopy) looks to be a very good book about the human side of producing open-source software.

blogging.la follows up on the old proposal for the city of los angeles to adopt open source and apply the money saved to hiring more police officers, and finds that the proposal appears to have gone off the rails. i wish i could say i was surprised.

but something that comes out in the report filed about current open source usage by the city is there are several departments using mysql, including the city ethics commission, and several others that think they could use it. cool.

democratizing development

democratizing innovation by eric von hippel is a fairly dry, academic business book, which made it tougher than i had expected to get through. there are some interesting observations and insights in the book, but they are perhaps too few and far between. you can read the book online.

over on planet mysql, the related topic of distributed version control has gotten some attention, with some shout-outs to free tools for doing distributed development. (i’ll add one for mercurial.)

ian bicking tries to argue in favor of centralized scm systems, but i think he’s neglecting the cost imposed on the center of the project by such centralized systems, a cost that the distributed systems do a really good job of distributing. you can impose something even better than his proposed “we don’t accept patches, we only accept pointers to branches in our repository”: “we don’t accept patches, we only pull changes from publicly available repositories.”

i can’t imagine the security nightmare of providing global check-in access to everyone, and the complexity of tools that would be required to manage the layers of dead-end branches.

left hand, meet the right hand

james gosling, who inflicted java on the world, had some interesting things to say about open-source companies:

Open source vendors also came under fire, with Gosling sideswiping MySQL, JBoss, and Red Hat: “They say that they are running their businesses based on services.

“These businesses are more hype than reality. If they don’t have a [longer term] economic model…they are going to have a really hard time.”

apparently he didn’t get the memo.

but i guess if anyone would be an expert on companies that lack financial viability, it would be an executive at sun.

i also love the other juxtaposition in the article: mysql isn’t open source because “no one is allowed to do check-ins,” but java is open-source, because the source has always been available.

city of angels to adopt open source?

a few los angeles city councilmembers have introduced a measure to have the city study using open-source software, and putting the possible money saved towards hiring new police officers. it sounds like a great plan, and i hope to get around to writing my city councilmember soon to encourage her to support the motion.

speaking of my city councilmember, i have gotten four calls from her campaign in the last few days. one of them was actually from the councilmember herself (before this open-source motion came up) due to some sort of mix-up by her campaign staff that led her to believe i had some issue i wanted to discuss. as i was sucking on the world of warcrack pipe at the time, i was in no mood to talk to her. then today was call number four, and i pointed out to the caller that if they called me again, i would almost certainly not vote for her in the upcoming primary. (the only other call i’ve gotten is from the bernard parks mayoral campaign.)