I need help determining the correct mathematical/logical
algorithm for a specific problem.
We are developing a software application that tracks
sales and we have two kinds of sale records:
1. those that are stored in our own database (let's call these "DB sales")
2. and those that come from a third-party payment application notification
(let's call these "3rd party sales").
We want to validate our own sales records (DB sales) against what is
actually recorded (3rd party sales).
Imagine a linear timeline drawn left to right with various dots along
the way, with DB sales above the line and 3rd party sales below the line.
We want to match up DB sales to 3rd party sales.
Assume some DB sales OR 3rd party sales could
either be missing or be present but bogus.
Obviously we want the best match (closest in time)...
But in the real world there may be delays or errors
in the timestamp, especially from the 3rd party.
The tricky part comes if a sale of one type falls roughly
in the middle of the timeline between 2 of the other
type of sale - which one do we match up?
There needs to be a period of time beyond which
the match makes no sense and won't be considered...
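
To make the cutoff and the "in the middle" case concrete, here is a
minimal sketch in Python. It assumes timestamps are plain integers
(minutes); the names `candidates_within` and `max_gap` are made up
for illustration and are not part of any existing system.

```python
def candidates_within(sale_time, other_times, max_gap):
    """List (gap, time) for every record of the other type that lies
    within max_gap of sale_time, closest first.

    Timestamps are assumed to be integers in minutes (hypothetical)."""
    found = [
        (abs(sale_time - t), t)
        for t in other_times
        if abs(sale_time - t) <= max_gap
    ]
    return sorted(found)


# A 3rd party sale at 12:00 (minute 720) with DB sales at 11:50, 12:10
# and 15:00. The 15:00 record is beyond the cutoff and is dropped; the
# other two are equally close, so time alone cannot break the tie.
print(candidates_within(720, [710, 730, 900], max_gap=60))
# prints [(10, 710), (10, 730)]
```

The tie in the output is exactly the ambiguous case: picking the
nearest record one sale at a time is not enough, and the decision
has to take the neighbouring sales into account.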
Things get more complicated if, for example,
6 out of 7 sales match during a day, so we're
ready to ignore one, but then another one
comes in... now we could match it, but
that match might be invalidated later. For instance,
we have a DB sale at 11:13 pm and no match,
then a third party record comes in at 12:03 am,
(50 minutes later) and we think it's a match,
but then we see another DB sale come in
at 12:04 am, so now we have to match the
12:03 with the 12:04 sale and undo the link
between the 11:13 sale and the 12:03 sale.
But then, another 3rd party sale comes in
at 12:07, so now maybe we have to rematch
11:13 with 12:03, and match 12:04 with 12:07, etc.
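
One possible way to frame this, offered only as a sketch and not as
a settled answer, is as a sequence-alignment style dynamic program:
sort both lists by time, charge each matched pair its time gap,
charge a fixed penalty for every record left unmatched, and pick the
cheapest overall assignment. The sketch assumes matches never cross
in time and that timestamps are integers in minutes. The names
`best_matching`, `max_gap` and `skip_cost`, and the values 60 and 30,
are hypothetical.

```python
def best_matching(db, tp, max_gap, skip_cost):
    """Cheapest order-preserving matching of DB sales to 3rd party sales.

    A matched pair costs its time gap (allowed only if gap <= max_gap);
    every unmatched record costs skip_cost. Times are integer minutes."""
    db, tp = sorted(db), sorted(tp)
    n, m = len(db), len(tp)
    # cost[i][j] = cheapest way to handle the first i DB and first j 3rd party sales
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * skip_cost
    for j in range(1, m + 1):
        cost[0][j] = j * skip_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = min(cost[i - 1][j], cost[i][j - 1]) + skip_cost
            gap = abs(db[i - 1] - tp[j - 1])
            if gap <= max_gap:
                best = min(best, cost[i - 1][j - 1] + gap)
            cost[i][j] = best
    # Walk back through the table to recover the chosen pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        gap = abs(db[i - 1] - tp[j - 1])
        if gap <= max_gap and cost[i][j] == cost[i - 1][j - 1] + gap:
            pairs.append((db[i - 1], tp[j - 1]))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + skip_cost:
            i -= 1  # DB sale i left unmatched
        else:
            j -= 1  # 3rd party sale j left unmatched
    pairs.reverse()
    return pairs


# Minutes counted from midnight of the first day:
# 11:13 pm = 1393, 12:03 am = 1443, 12:04 am = 1444, 12:07 am = 1447.
print(best_matching([1393], [1443], 60, 30))
# prints [(1393, 1443)]
print(best_matching([1393, 1444], [1443], 60, 30))
# prints [(1444, 1443)]
print(best_matching([1393, 1444], [1443, 1447], 60, 30))
# prints [(1393, 1443), (1444, 1447)]
```

With these assumed parameters the three calls reproduce the
sequence described above: match, undo and rematch to the closer
sale, then restore the first link once the second 3rd party sale
arrives.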
Can we avoid going back through the entire
history of the account to get the best match?
What kind of algorithm do I need?
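
On the question of not revisiting the whole history, one idea that
follows from the cutoff is sketched below. If two neighbouring
records on the combined timeline are further apart than the cutoff,
no match can span that gap, so the timeline falls into independent
clusters and a new record only affects its own cluster. The function
name `split_into_clusters`, the `"db"`/`"tp"` labels and the integer
minute timestamps are assumptions made for illustration.

```python
def split_into_clusters(db, tp, max_gap):
    """Group all records into runs where consecutive records (of either
    type) are at most max_gap apart. No valid match can cross a cluster
    boundary, so each cluster can be matched on its own."""
    events = sorted([(t, "db") for t in db] + [(t, "tp") for t in tp])
    clusters = []
    for t, kind in events:
        # Start a new cluster when the gap to the previous record is too large.
        if not clusters or t - clusters[-1][-1][0] > max_gap:
            clusters.append([])
        clusters[-1].append((t, kind))
    return clusters


clusters = split_into_clusters([1393, 1444, 2000], [1443, 1447, 2005], max_gap=60)
for c in clusters:
    print(c)
# prints [(1393, 'db'), (1443, 'tp'), (1444, 'db'), (1447, 'tp')]
# prints [(2000, 'db'), (2005, 'tp')]
```

Under this assumption, a record arriving at minute 2005 would only
require re-examining the second cluster, not the records from the
night before.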