Posted to dev@datasketches.apache.org by "Bhowmick, Rima" <rb...@visa.com.INVALID> on 2023/05/17 13:32:32 UTC

Re: [E] Postgres HLL is very slow

Thanks for making DataSketches version 1.6 live; it will help us a lot.
Today we downloaded the package from the PGXN website <https://pgxn.org/dist/datasketches/>. Is it mandatory to install the Boost package too?
When we installed version 1.3 of the Postgres DataSketches plugin earlier, we did not use Boost.

Also, are the steps below, as mentioned in the documentation, sufficient to install it?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon
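
The pairing Jon describes can be checked before starting a build. What follows is only a minimal sketch, not part of either project: the function name `cpp_major_ok` is made up for illustration, and the single rule it encodes is the one stated above (extension 1.5.x expects a datasketches-cpp 3.x release).

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the datasketches-cpp version belongs to
# the release line that the given extension version was said to need.
# Encodes just the rule stated in this thread: postgresql 1.5.x <-> cpp 3.x.
cpp_major_ok() {
    ext_version=$1   # e.g. 1.5.0
    cpp_version=$2   # e.g. 4.0.1
    cpp_major=${cpp_version%%.*}   # keep the text before the first dot
    case $ext_version in
        1.5|1.5.*) [ "$cpp_major" = "3" ] ;;
        *) echo "no rule recorded for extension $ext_version" >&2; return 2 ;;
    esac
}
```

With the versions from the failing build, `cpp_major_ok 1.5.0 4.0.1 || echo "mismatch"` would report a mismatch, which matches the template-argument error shown in the report.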

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install a new version of datasketches in our Postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip) and datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from the Apache website, along with Boost 1.81.0. I followed the steps in the readme file. While executing the make command, I got this error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

It looks like there is a mismatch in template arguments between kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution? Thank you!

https://github.com/apache/datasketches-postgresql/issues/62
The DataSketches distinct-count Postgres extension is used in our applications to deliver significant business value; therefore, if we cannot upgrade, it would be a big loss for us.
Could you please guide us on the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for releasing the parallel execution support?
Also, we tried upgrading the datasketches version following the latest documentation, and we are getting a lot of C++ version issues.
It is very tough to install the new version. Any thoughts?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also, we have just made some improvements to HLL merge speed in the core library. These changes have not been released yet, but they are available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   A Hadoop data pipeline job pushes pre-aggregated active card data (using the Hive DataSketches algorithm), along with other details, to Hive.
  *   Another job loads that data into the Postgres DB, which ends up holding 3 years of data for 4 regions across multiple countries.
  *   A Tableau dashboard has a live connection to the Postgres DB.
  *   The Tableau query calls the Postgres DB to aggregate the binary/pre-aggregated data into a distinct card count (using the DataSketches algorithm) and fetches data based on multiple filter conditions.
  *   Usually the data covers a span of 2 months in each of the 3 years, meaning a total of 6 months of data to aggregate for a country on multiple conditions.

Usually the response to this aggregation query is quite slow. We have tried a lot of different ways to resolve this.

The datasketches part accounts for most of the execution time.

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
The short answer is no, we don't expect to see any difference in estimates.
However, there is always a chance of having some bugs. On the one hand,
version 1.3.0 is quite old, and there is a chance we fixed something since
then. I cannot remember; I would need to go carefully through the release notes
for both the PostgreSQL extension and the core library. On the other hand, there is
a chance we introduced some bugs.
What do you have in that column? Is it a metric column with HLL sketches?
When you say "one column" do you mean there are other columns with HLL
sketches, and there is no problem with them? How do you compare? Is your
dev environment identical to prod and you compare dev to prod numbers? Is
there a way to check by doing a "brute-force" distinct count of the
underlying raw identifiers?
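
To put the reported gap in perspective, the figures can be compared with a rough error yardstick. The sketch below is an illustration only: it uses the textbook HyperLogLog standard error of about 1.04/sqrt(k) with k = 2^lg_k, and lg_k = 12 is purely an assumption, since the thread does not say how the sketches were configured. The function names are made up for this example.

```shell
#!/bin/sh
# Percent by which estimate $2 falls short of estimate $1.
rel_diff_pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (a - b) / a * 100 }'
}

# Textbook HyperLogLog relative standard error, in percent, for k = 2^lg_k.
# lg_k is an assumption here; the thread does not state it.
hll_rse_pct() {
    awk -v lgk="$1" 'BEGIN { printf "%.1f\n", 1.04 / sqrt(2 ^ lgk) * 100 }'
}

rel_diff_pct 71000000 65000000   # prod vs dev figures quoted in the thread
hll_rse_pct 12                   # assumed lg_k
```

Under that assumption the two lines print 8.5 and 1.6, so the gap would be several standard errors wide. That is consistent with treating it as something to investigate, for example with the brute-force count suggested above, rather than as ordinary estimation noise.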

On Mon, Jul 10, 2023 at 9:27 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Hi Alexander,
>
> Thank you so much for your help. With your continuous support, we were
> able to install datasketches v1.6 in our dev environment. After the
> installation, we started testing our queries which use hll sketch.
>
> But while testing, we observed that the hll estimate for one column has
> dramatically reduced. In prod, we  have 71M unique elements for that
> column, but in dev it is now 65M. Our prod environment is still using the
> old version of datasketches which is V1.3. Is it expected to see this
> difference in newer version?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Thursday, 22 June 2023 at 8:46 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> How is it working for you?
>
>
>
> On Wed, May 31, 2023 at 9:59 AM Alexander Saydakov <sa...@yahooinc.com>
> wrote:
>
> Did you check that /pgbin/mbi1d/14.x/lib/postgresql/datasketches.so is the
> new one?
>
>
>
> On Wed, May 31, 2023 at 9:28 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Yes Alexander, we did run make first.
>
> And it did create datasketches.so in the same folder.
>
>
>
> Then we ran make install and, after that, the alter command in psql.
>
>
>
> But the same error persists.
>
>
>
> mbi1d01=# alter extension datasketches update to '1.6.0';
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Wednesday, 31 May 2023 at 9:37 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> In your steps above I don't see "make" before make install. The result of
> that should be datasketches.so in the current directory.
>
> And "make install" should deploy that datasketches.so to the right place.
> Check to see if that datasketches.so in the error message is the new one.
>
> And I hope that you are doing this on a test database.
>
>
>
>
>
> On Wed, May 31, 2023 at 8:38 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Yes Alexander, thanks a lot for responding!
>
>
>
> We did run
>
>    - “make clean”
>    - “make install”
>    - “alter extension datasketches update to '1.6.0'”
>
>
>
> But we are facing same error as below:
>
>
>
> mbi1d01=# alter extension datasketches update to '1.6.0';
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
> Here are the version information:
>
> *Postgres:* PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
> 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
>
> *Datasketches*: V1.3 , want to upgrade to V1.6
>
> *Boost*: 1.75.0
>
>
>
> Please guide us how can we get rid of this error and install the plugin
> successfully.
>
>
>
> *Note*: We are using data sketches HLL algorithm, for heavily loaded
> Postgres DBs, cannot drop the extension and reinstall because it might drop
> the column itself.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 30 May 2023 at 10:43 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> did you install the new datasketches.so (sudo make install) after you
> built it?
>
>
>
> On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com>
> wrote:
>
> alter extension datasketches update to '1.6.0';
>
>
>
> On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> Please help us to fix the current issue we are facing to migrate from 1.3
> to 1.6 version.
>
>
>
> Error:
>
> mbi1d01=# create extension datasketches;
> ERROR:  extension "datasketches" already exists
>
> mbi1d01=# drop extension datasketches;
> ERROR:  cannot drop extension datasketches because other objects depend on
> it
> DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on
> type mbi.hll_sketch
>
>
>
> Question:
>
>  We already using v1.3. When we run create extension it throws an error
> that extension already exists. We cannot drop the extension as we get a
> warning that hll_sketch column will be dropped from tables. Is it possible
> to upgrade from 1.3 to 1.6 directly without dropping the extension. If yes,
>
> We tried using below command to update the extension but we can get below
> error:
>
> mbi1d01=# alter extension datasketches update;
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Wednesday, 24 May 2023 at 6:45 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I managed to reproduce the problem. It seems to me that such an old GCC
> 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
>
>
>
> On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com>
> wrote:
>
> What is your PostgreSQL version?
>
>
>
> On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Here is the C++ version
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 9:39 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> What is your compiler?
>
> Try "c++ --version"
>
>
>
> On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Just to add on here the versions used.
>
>
>
>    1. datasketches-1.6.0 with datasketches-cpp included.
>    2. boost_1_80_0
>
>
>
> here is folder structure:
>
>
>
> Error after “make install”
>
>
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Not able to install the latest version of datasketches.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 8:38 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hello All,
>
>
>
> Facing this error while installing the dataSketches 1.6 version and doing
> “make install”.
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Please guide!
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Monday, 22 May 2023 at 10:12 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> This is not an error. This is a warning.
>
>
>
> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hi Alexander,
>
>
>
> We are trying to upgrade the datasketches-postgresql extension on one of
> our Linux servers.
>
>
>
> We are getting this error:
>
>
>
> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
> src/aod_sketch_c_adapter.cpp
> In file included from
> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>                  from boost/boost/math/special_functions/beta.hpp:13,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
> language standard to use Boost.Math will be C++14 starting in July 2023
> (Boost 1.82 release)" [-Wcpp]
> #    warning "The minimum language standard to use Boost.Math will be
> C++14 starting in July 2023 (Boost 1.82 release)"
>       ^
> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>                  from boost/boost/math/special_functions/gamma.hpp:18,
>                  from boost/boost/math/special_functions/beta.hpp:15,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
> boost::math::tools::integer_scalar_type<long double, true>’:
> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double; uintmax_t = long unsigned int]’
> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double]’
>
>
>
> We are using boost_1_80_0 version, please let us know if you have any clue
> what could be wrong?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 19 May 2023 at 1:54 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
> Just download, unpack and make a link so that the "make" can find it.
>
> I am afraid I don't quite understand your question. I would suggest
> following the Readme and asking specific questions about what is not clear
> or what goes wrong.
>
>
>
> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>
> Today we downloaded the package *PGXN <https://pgxn.org/dist/datasketches/>*
> website, is it mandatory to install the Boost package too?
>
> While installing 1.3 version of Postgres dataSketches plugin earlier, we
> didn’t use Boost then.
>
>
>
> Also to install are the below steps are sufficient as mentioned in
> documentation?
>
> *Building and installing*
>
>    - make
>    - sudo make install
>
> Thanks in advance!
>
>
>
> Regards,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Thursday, 27 April 2023 at 1:25 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> The changes in question have been merged to the master branch.
>
> We have just started the release process for datasketches-cpp (version
> 4.1.0). Once this is done, we will start the release process for
> datasketches-postgresql 1.6.0. In the meantime you may want to try the
> latest code with the latest datasketches-cpp from the master branch.
>
>
>
> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>
> As noted in the linked issue, the postgresql 1.5 package is compatible
> with the cpp 3.x line, not 4.x. It should work fine with the last
> datasketches-cpp 3.x release.
>
>
>
> In the meantime, as noted, we are actively trying to work on speed
> improvements for HLL as requested at the start of this thread.
>
>
>
> Additionally, one thing that can help speed releases is to vote whenever
> there's a vote announcement -- even a non-binding vote is valuable!
>
>
>
>   jon
>
>
>
> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> We are trying to install new version of datasketches in our postgres
> instance. I have downloaded datasketches-postgresql 1.5.0
> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
> (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost
> 1.81.0. I have followed the same steps as mentioned in the readme file.
> While executing the make command, I faced an error:
>
> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
> -Idatasketches-cpp/req/include -I. -I./
> -I/pgbin/mbi1d/12.x/include/postgresql/server
> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template
> arguments (4, should be 3)
> typedef datasketches::kll_sketch<float, std::less<float>,
> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>
> ^
> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
> ‘template<class T, class C, class A> class datasketches::kll_sketch’
> class kll_sketch {
>
> Looks like there is a mismatch of arguments in
> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
> Could you please suggest a solution. Thank you!
>
> https://github.com/apache/datasketches-postgresql/issues/62
>
> *The DataSketches distinct-count Postgres extension is used in our
> applications to deliver significant business value; therefore, if we cannot
> upgrade, it would be a big loss for us.*
>
> *Could you please guide us what could be the best approach to overcome
> this?*
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Saturday, 15 April 2023 at 12:05 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I am not sure about the date. I think the development should take a few
> days. A formal Apache release will take substantially more time just to go
> through the required steps of voting for the core library release (not
> really necessary for the parallel execution, but necessary to bring the
> latest speed improvements into PostgreSQL extension), and then going
> through the same procedure to release the extension.
>
> Of course, you don't have to wait for the formal release to start testing.
>
> Could you clarify your issues building the latest version please? I
> believe that the datasketches-postgresql code in the master branch is
> compatible with the latest datasketches-cpp code.
>
>
>
> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Alexander,
>
>
>
> Do you have any date in mind, for releasing the same to have parallel
> execution?
>
> Also we tried upgrading datasketches version from latest documentation, we
> are getting lot of C++ version issues.
>
> Its very tough to install the new version. Any thoughts?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 14 April 2023 at 10:58 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hi Rima,
>
> I am working on the datasketches extension to support parallel queries
> (distributed aggregation).
>
> I expect to get this done in a matter of days.
>
> Also we have just made some improvements to HLL merge speed in the core
> library. These changes were not released yet, but available in the master
> branch.
>
> We have another HLL performance improvement in mind. I will work on it
> once I finish the parallel query support.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Team,
>
>
>
> Here is the snapshot of the existing application:
>
>
>
> TechStack: Postgres DB, Hive, Tableau UI
>
> Postgres Plugin: DataSketches
>
>
>
> Flow in brief:
>
>    - Hadoop Data pipeline job pushes pre-aggregated(using hive
>    datasketches algo) active card data, along with other details to Hive.
>    - Another job populates that data to Postgres DB, finally having 3
>    years data of 4 regions for multiple countries.
>    - Tableau dashboard having live connection to Postgres DB.
>    - Tableau Query calling Postgres DB, to aggregate the
>    binary/pre-aggregated data to get distinct card count (using DataSketches
>    algorithm) and fetch data based on multiple filter conditions.
>    - Usually data would be of 3yrs for the span of 2 months, means total
>    6 months of data to aggregate for a country on multiple conditions.
>
>
>
> Usually this aggregation query response is quite slow. We have tried lot
> of different ways to resolve this,
>
>
>
> Mainly datasketches part is making most of the time in execution.
>
>
>
> Thanks & Regards,
>
> Rima Bhowmick
>
> Marketing Brand Analytics
>
>
>

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Hi Alexander,

Thank you so much for your help. With your continuous support, we were able to install datasketches v1.6 in our dev environment. After the installation, we started testing our queries which use hll sketch.

But while testing, we observed that the HLL estimate for one column has dropped dramatically. In prod, we have 71M unique elements for that column, but in dev it is now 65M. Our prod environment is still using the old version of datasketches, V1.3. Is this difference expected in the newer version?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 22 June 2023 at 8:46 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

How is it working for you?

On Wed, May 31, 2023 at 9:59 AM Alexander Saydakov <sa...@yahooinc.com> wrote:
Did you check that /pgbin/mbi1d/14.x/lib/postgresql/datasketches.so is the new one?

On Wed, May 31, 2023 at 9:28 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Yes Alexander, we did run make first.
And it did create datasketches.so in the same folder.

Then we ran make install and, after that, the alter command in psql.

But the same error persists.

mbi1d01=# alter extension datasketches update to '1.6.0';
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Wednesday, 31 May 2023 at 9:37 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

In your steps above I don't see "make" before make install. The result of that should be datasketches.so in the current directory.
And "make install" should deploy that datasketches.so to the right place. Check to see if that datasketches.so in the error message is the new one.
And I hope that you are doing this on a test database.
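
The check suggested here, whether the installed datasketches.so really is the newly built one, can be sketched as two small shell helpers. This is an illustration under assumptions: the function names are invented for this example, and the byte search is only a crude stand-in for inspecting the library's exported symbols.

```shell
#!/bin/sh
# Succeeds if the freshly built library and the installed one are byte-identical.
same_library() {
    cmp -s "$1" "$2"
}

# Crude check: succeeds if the symbol name occurs anywhere in the file's bytes.
# "nm -D" on the library is the more precise tool where binutils is available.
mentions_symbol() {
    grep -q -F -e "$2" "$1"
}

# Example use (paths taken from the thread; adjust to your layout):
#   installed=/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so
#   same_library ./datasketches.so "$installed" || echo "installed copy differs from build"
#   mentions_symbol "$installed" pg_theta_intersection_get_result || echo "symbol name not found"
```

If the two files differ, "make install" may have deployed to a different PostgreSQL installation than the one the server is running from; `pg_config --pkglibdir` prints the library directory of whichever installation that `pg_config` belongs to.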


On Wed, May 31, 2023 at 8:38 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Yes Alexander, thanks a lot for responding!

We did run

  *   “make clean”
  *   “make install”
  *   “alter extension datasketches update to '1.6.0'”

But we are facing same error as below:

mbi1d01=# alter extension datasketches update to '1.6.0';
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"

Here is the version information:
Postgres: PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
Datasketches: V1.3 , want to upgrade to V1.6
Boost: 1.75.0

Please guide us on how we can get rid of this error and install the plugin successfully.

Note: We are using the DataSketches HLL algorithm on heavily loaded Postgres DBs, so we cannot drop the extension and reinstall it, because that might drop the column itself.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 30 May 2023 at 10:43 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

did you install the new datasketches.so (sudo make install) after you built it?

On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com> wrote:

alter extension datasketches update to '1.6.0';
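
When running the statement above, it can help to see which version the database reports before and after, and which versions the server knows how to install. The helper below is a sketch only: the function name is made up, it merely prints SQL for psql to run, and it relies on the standard PostgreSQL catalogs `pg_extension` and `pg_available_extension_versions`.

```shell
#!/bin/sh
# Print the statements to run in psql; nothing is executed here.
upgrade_statements() {
    target=$1   # version to update to, e.g. 1.6.0
    cat <<EOF
SELECT extversion FROM pg_extension WHERE extname = 'datasketches';
SELECT version, installed FROM pg_available_extension_versions WHERE name = 'datasketches';
ALTER EXTENSION datasketches UPDATE TO '$target';
SELECT extversion FROM pg_extension WHERE extname = 'datasketches';
EOF
}

# Example use (database name taken from the psql prompt in the thread):
#   upgrade_statements 1.6.0 | psql -d mbi1d01
```

If the target version is missing from the second query's output, the new extension scripts were probably not installed where this server looks for them.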

On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello All,
Please help us fix the issue we are facing in migrating from version 1.3 to 1.6.

Error:
mbi1d01=# create extension datasketches;
ERROR:  extension "datasketches" already exists
mbi1d01=# drop extension datasketches;
ERROR:  cannot drop extension datasketches because other objects depend on it
DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on type mbi.hll_sketch

Question:
We are already using v1.3. When we run create extension, it throws an error that the extension already exists. We cannot drop the extension, as we get a warning that the hll_sketch column will be dropped from tables. Is it possible to upgrade from 1.3 to 1.6 directly without dropping the extension? If so, how?
We tried the command below to update the extension, but we get the following error:
mbi1d01=# alter extension datasketches update;
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"


Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Wednesday, 24 May 2023 at 6:45 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I managed to reproduce the problem. It seems to me that such an old GCC 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
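
When switching between Boost releases as suggested above, it can help to confirm which release a given unpacked tree actually contains. The following is only a sketch: the function name boost_lib_version is hypothetical, and it assumes the tree follows the usual Boost layout, where boost/version.hpp defines BOOST_LIB_VERSION.

```shell
#!/bin/sh
# Hypothetical helper (not part of datasketches or Boost):
# print the BOOST_LIB_VERSION string found in an unpacked Boost tree.
# Usage: boost_lib_version <path-to-unpacked-boost>
boost_lib_version() {
  header="$1/boost/version.hpp"
  if [ ! -f "$header" ]; then
    echo "cannot find $header" >&2
    return 1
  fi
  # Extract the quoted value from: #define BOOST_LIB_VERSION "1_75"
  sed -n 's/^#define[[:space:]]*BOOST_LIB_VERSION[[:space:]]*"\([^"]*\)".*$/\1/p' "$header"
}
```

For an unpacked Boost 1.75.0 tree, boost_lib_version /path/to/boost_1_75_0 should print 1_75 (the path here is a placeholder).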

On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com> wrote:
What is your PostgreSQL version?

On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Here is the C++ version

[image: screenshot of the compiler version output; not preserved in this plain-text archive]

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 9:39 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

What is your compiler?
Try "c++ --version"

On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Just to add, here are the versions used:


  1.  datasketches-1.6.0 with datasketches-cpp included.
  2.  boost_1_80_0

Here is the folder structure:
[image: screenshot of the folder structure; not preserved in this plain-text archive]

Error after “make install”:

src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

We are not able to install the latest version of datasketches.

Thanks,
Rima Bhowmick.

From: "Bhowmick, Rima" <rb...@visa.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 8:38 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hello All,

We are facing this error when running “make install” for DataSketches version 1.6:
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Monday, 22 May 2023 at 10:12 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

This is not an error. This is a warning.

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using boost_1_80_0. Please let us know if you have any clue about what could be wrong.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 19 May 2023 at 1:54 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Yes, version 1.6.0 does depend on Boost. There is no need to install it. Just download it, unpack it, and make a link so that "make" can find it.
I am afraid I don't quite understand your question. I would suggest following the Readme and asking specific questions about what is not clear or what goes wrong.

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Thanks for making the dataSketches 1.6 version live; it will help us a lot.
Today we downloaded the package from the PGXN website <https://pgxn.org/dist/datasketches/>. Is it mandatory to install the Boost package too?
When we installed version 1.3 of the Postgres dataSketches plugin earlier, we did not use Boost.

Are the steps below, as mentioned in the documentation, sufficient to install it?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install a new version of datasketches in our Postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip) and datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from the Apache website, as well as Boost 1.81.0. I followed the steps in the readme file. While executing the make command, I got an error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

It looks like there is a mismatch in the number of template arguments between kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution? Thank you!

https://github.com/apache/datasketches-postgresql/issues/62
The DataSketches distinct-count Postgres extension is used in our applications and delivers significant business value; therefore, if we cannot upgrade the versions, it would be a big loss for us.
Could you please guide us on the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for the release with parallel execution?
Also, we tried upgrading the datasketches version following the latest documentation, and we are getting a lot of C++ version issues.
It's very tough to install the new version. Any thoughts?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also, we have just made some improvements to HLL merge speed in the core library. These changes have not been released yet, but they are available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   A Hadoop data pipeline job pushes pre-aggregated (using the Hive DataSketches algorithm) active card data, along with other details, to Hive.
  *   Another job populates that data into the Postgres DB, which ends up holding 3 years of data for 4 regions and multiple countries.
  *   A Tableau dashboard has a live connection to the Postgres DB.
  *   A Tableau query calls the Postgres DB to aggregate the binary/pre-aggregated data into a distinct card count (using the DataSketches algorithm) and fetches data based on multiple filter conditions.
  *   Usually a query covers a span of 2 months in each of the 3 years, meaning a total of 6 months of data to aggregate for a country under multiple conditions.

Usually this aggregation query response is quite slow. We have tried a lot of different ways to resolve this.

The datasketches part takes most of the execution time.

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
How is it working for you?

On Wed, May 31, 2023 at 9:59 AM Alexander Saydakov <sa...@yahooinc.com>
wrote:

> Did you check that /pgbin/mbi1d/14.x/lib/postgresql/datasketches.so is the
> new one?
>
> On Wed, May 31, 2023 at 9:28 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
>> Yes Alexander, we did run make first.
>>
>> And it did create datasketches.so in the same folder.
>>
>>
>>
>> Then we ran make install and after that the alter command in psql.
>>
>>
>>
>> But the same error persists.
>>
>>
>>
>> mbi1d01=# alter extension datasketches update to '1.6.0';
>> ERROR:  could not find function "pg_theta_intersection_get_result" in
>> file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Wednesday, 31 May 2023 at 9:37 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> In your steps above I don't see "make" before make install. The result of
>> that should be datasketches.so in the current directory.
>>
>> And "make install" should deploy that datasketches.so to the right place.
>> Check to see if that datasketches.so in the error message is the new one.
>>
>> And I hope that you are doing this on a test database.
>>
>>
>>
>>
>>
>> On Wed, May 31, 2023 at 8:38 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Yes Alexander, thanks a lot for responding!
>>
>>
>>
>> We did run
>>
>>    - “make clean”
>>    - “make install”
>>    - “alter extension datasketches update to '1.6.0'”
>>
>>
>>
>> But we are facing same error as below:
>>
>>
>>
>> mbi1d01=# alter extension datasketches update to '1.6.0';
>> ERROR:  could not find function "pg_theta_intersection_get_result" in
>> file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>>
>>
>>
>> Here are the version information:
>>
>> *Postgres:* PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc
>> (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
>>
>> *Datasketches*: V1.3 , want to upgrade to V1.6
>>
>> *Boost*: 1.75.0
>>
>>
>>
>> Please guide us how can we get rid of this error and install the plugin
>> successfully.
>>
>>
>>
>> *Note*: We are using data sketches HLL algorithm, for heavily loaded
>> Postgres DBs, cannot drop the extension and reinstall because it might drop
>> the column itself.
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Tuesday, 30 May 2023 at 10:43 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> did you install the new datasketches.so (sudo make install) after you
>> built it?
>>
>>
>>
>> On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com>
>> wrote:
>>
>> alter extension datasketches update to '1.6.0';
>>
>>
>>
>> On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello All,
>>
>> Please help us to fix the current issue we are facing to migrate from 1.3
>> to 1.6 version.
>>
>>
>>
>> Error:
>>
>> mbi1d01=# create extension datasketches;
>> ERROR:  extension "datasketches" already exists
>>
>> mbi1d01=# drop extension datasketches;
>> ERROR:  cannot drop extension datasketches because other objects depend
>> on it
>> DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on
>> type mbi.hll_sketch
>>
>>
>>
>> Question:
>>
>>  We already using v1.3. When we run create extension it throws an error
>> that extension already exists. We cannot drop the extension as we get a
>> warning that hll_sketch column will be dropped from tables. Is it possible
>> to upgrade from 1.3 to 1.6 directly without dropping the extension. If yes,
>>
>> We tried using below command to update the extension but we can get below
>> error:
>>
>> mbi1d01=# alter extension datasketches update;
>> ERROR:  could not find function "pg_theta_intersection_get_result" in
>> file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Wednesday, 24 May 2023 at 6:45 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> I managed to reproduce the problem. It seems to me that such an old GCC
>> 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
>>
>>
>>
>> On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com>
>> wrote:
>>
>> What is your PostgreSQL version?
>>
>>
>>
>> On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Here is the C++ version
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Tuesday, 23 May 2023 at 9:39 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> What is your compiler?
>>
>> Try "c++ --version"
>>
>>
>>
>> On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Just to add on here the versions used.
>>
>>
>>
>>    1. datasketches-1.6.0 with datasketches-cpp included.
>>    2. boost_1_80_0
>>
>>
>>
>> here is folder structure:
>>
>>
>>
>> Error after “make install”
>>
>>
>>
>> src/aod_sketch_c_adapter.cpp:310:56:   required from here
>> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
>> not a class, struct, or union type
>>        using value_type = typename T::value_type;
>>                                                 ^
>> make: *** [src/aod_sketch_c_adapter.o] Error 1
>>
>>
>>
>> Not able to install the latest version of datasketches.
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Tuesday, 23 May 2023 at 8:38 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> Hello All,
>>
>>
>>
>> Facing this error while installing the dataSketches 1.6 version and doing
>> “make install”.
>>
>> src/aod_sketch_c_adapter.cpp:310:56:   required from here
>> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
>> not a class, struct, or union type
>>        using value_type = typename T::value_type;
>>                                                 ^
>> make: *** [src/aod_sketch_c_adapter.o] Error 1
>>
>>
>>
>> Please guide!
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Monday, 22 May 2023 at 10:12 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> This is not an error. This is a warning.
>>
>>
>>
>> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hi Alexander,
>>
>>
>>
>> We are trying to upgrade datasketches-postgresSQL extension into one of
>> our linux servers.
>>
>>
>>
>> We are getting this error:
>>
>>
>>
>> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
>> src/aod_sketch_c_adapter.cpp
>> In file included from
>> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>>                  from boost/boost/math/special_functions/beta.hpp:13,
>>                  from boost/boost/math/distributions/students_t.hpp:16,
>>                  from src/aod_sketch_c_adapter.cpp:34:
>> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
>> language standard to use Boost.Math will be C++14 starting in July 2023
>> (Boost 1.82 release)" [-Wcpp]
>> #    warning "The minimum language standard to use Boost.Math will be
>> C++14 starting in July 2023 (Boost 1.82 release)"
>>       ^
>> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>>                  from boost/boost/math/special_functions/gamma.hpp:18,
>>                  from boost/boost/math/special_functions/beta.hpp:15,
>>                  from boost/boost/math/distributions/students_t.hpp:16,
>>                  from src/aod_sketch_c_adapter.cpp:34:
>> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
>> boost::math::tools::integer_scalar_type<long double, true>’:
>> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
>> boost::math::tools::detail::fraction_traits<Gen>::result_type
>> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
>> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
>> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
>> long double; uintmax_t = long unsigned int]’
>> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
>> boost::math::tools::detail::fraction_traits<Gen>::result_type
>> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
>> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
>> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
>> long double]’
>>
>>
>>
>> We are using boost_1_80_0 version, please let us know if you have any
>> clue what could be wrong?
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Friday, 19 May 2023 at 1:54 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
>> Just download, unpack and make a link so that the "make" can find it.
>>
>> I am afraid I don't quite understand your question. I would suggest
>> following the Readme and asking specific questions about what is not clear
>> or what goes wrong.
>>
>>
>>
>> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>>
>> Today we downloaded the package *PGXN
>> <https://urldefense.com/v3/__https://pgxn.org/dist/datasketches/__;!!Op6eflyXZCqGR5I!Hsl5aj72x0KpYPkziCaV1WKj1Qw2olWIRzEkhx95orMBJyA1UDGGwiAhuCU8KfPK9EUNX_x2AK_b_-z0gOYuCuQJ$>*
>> website, is it mandatory to install the Boost package too?
>>
>> While installing 1.3 version of Postgres dataSketches plugin earlier, we
>> didn’t use Boost then.
>>
>>
>>
>> Also to install are the below steps are sufficient as mentioned in
>> documentation?
>>
>> *Building and installing*
>>
>>    - make
>>    - sudo make install
>>
>> Thanks in advance!
>>
>>
>>
>> Regards,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Thursday, 27 April 2023 at 1:25 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> The changes in question have been merged to the master branch.
>>
>> We have just started the release process for datasketches-cpp (version
>> 4.1.0). Once this is done, we will start the release process for
>> datasketches-postgress 1.6.0. In the meantime you may want to try the
>> latest code with the latest datasketches-cpp from the master branch.
>>
>>
>>
>> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>>
>> As noted in the linked issue, the postgresql 1.5 package is compatible
>> with the cpp 3.x line, not 4.x. It should work fine with the last
>> datasketches-cpp 3.x release.
>>
>>
>>
>> In the meantime, as noted, we are actively trying to work on speed
>> improvements for HLL as requested at the start of this thread.
>>
>>
>>
>> Additionally, one thing that can help speed releases is to vote whenever
>> there's a vote announcement -- even a non-binding vote is valuable!
>>
>>
>>
>>   jon
>>
>>
>>
>> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello All,
>>
>> We are trying to install new version of datasketches in our postgres
>> instance. I have downloaded datasketches-postgresql 1.5.0
>> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
>> (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost
>> 1.81.0. I have followed the same steps as mentioned in the readme file.
>> While executing the make command, I faced an error:
>>
>> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
>> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
>> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
>> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
>> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
>> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
>> -Idatasketches-cpp/req/include -I. -I./
>> -I/pgbin/mbi1d/12.x/include/postgresql/server
>> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
>> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
>> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
>> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of
>> template arguments (4, should be 3)
>> typedef datasketches::kll_sketch<float, std::less<float>,
>> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>>
>> ^
>> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
>> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
>> ‘template<class T, class C, class A> class datasketches::kll_sketch’
>> class kll_sketch {
>>
>> Looks like there is a mismatch of arguments in
>> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
>> Could you please suggest a solution. Thank you!
>>
>> https://github.com/apache/datasketches-postgresql/issues/62
>> <https://urldefense.com/v3/__https://github.com/apache/datasketches-postgresql/issues/62__;!!Op6eflyXZCqGR5I!AXYYf_BpeznMsFEbt8pJ4V5PV7QlzoTCJBji7ph7ERc1GUSjX1JBNUm6yS8ThWoqZNtMlh5R5l4DZo9-Lw$>
>>
>> *Datasketches Distinct count postgres extension algorithm is used in our
>> applications to get very prominent business value, therefor if we cannot
>> upgrade the versions, it would be a bigg loss for us.*
>>
>> *Could you please guide us what could be the best approach to overcome
>> this?*
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Saturday, 15 April 2023 at 12:05 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> I am not sure about the date. I think the development should take a few
>> days. A formal Apache release will take substantially more time just to go
>> through the required steps of voting for the core library release (not
>> really necessary for the parallel execution, but necessary to bring the
>> latest speed improvements into PostgreSQL extension), and then going
>> through the same procedure to release the extension.
>>
>> Of course, you don't have to wait for the formal release to start testing.
>>
>> Could you clarify your issues building the latest version please? I
>> believe that the datasketches-postgresql code in the master branch is
>> compatible with the latest datasketches-cpp code.
>>
>>
>>
>> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello Alexander,
>>
>>
>>
>> Do you have any date in mind, for releasing the same to have parallel
>> execution?
>>
>> Also we tried upgrading datasketches version from latest documentation,
>> we are getting lot of C++ version issues.
>>
>> Its very tough to install the new version. Any thoughts?
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Friday, 14 April 2023 at 10:58 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> Hi Rima,
>>
>> I am working on the datasketches extension to support parallel queries
>> (distributed aggregation).
>>
>> I expect to get this done in a matter of days.
>>
>> Also we have just made some improvements to HLL merge speed in the core
>> library. These changes were not released yet, but available in the master
>> branch.
>>
>> We have another HLL performance improvement in mind. I will work on it
>> once I finish the parallel query support.
>>
>>
>>
>>
>>
>> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello Team,
>>
>>
>>
>> Here is the snapshot of the existing application:
>>
>>
>>
>> TechStack: Postgres DB, Hive, Tableau UI
>>
>> Postgres Plugin: DataSketches
>>
>>
>>
>> Flow in brief:
>>
>>    - Hadoop Data pipeline job pushes pre-aggregated(using hive
>>    datasketches algo) active card data, along with other details to Hive.
>>    - Another job populates that data to Postgres DB, finally having 3
>>    years data of 4 regions for multiple countries.
>>    - Tableau dashboard having live connection to Postgres DB.
>>    - Tableau Query calling Postgres DB, to aggregate the
>>    binary/pre-aggregated data to get distinct card count (using DataSketches
>>    algorithm) and fetch data based on multiple filter conditions.
>>    - Usually data would be of 3yrs for the span of 2 months, means total
>>    6 months of data to aggregate for a country on multiple conditions.
>>
>>
>>
>> Usually the response to this aggregation query is quite slow. We have tried
>> a lot of different ways to resolve this.
>>
>>
>>
>> The datasketches part takes up most of the execution time.
>>
>>
>>
>> Thanks & Regards,
>>
>> Rima Bhowmick
>>
>> Marketing Brand Analytics
>>
>>
>>

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
Did you check that /pgbin/mbi1d/14.x/lib/postgresql/datasketches.so is the
new one?
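
One way to check this, sketched under assumptions: `has_symbol` and `report_symbol` below are hypothetical helpers, not part of datasketches-postgresql, and they assume that GNU binutils `nm` is available. The idea is to list the dynamic symbols the library exports and look for the function named in the error message.

```shell
# Hypothetical helper (not part of the extension): succeeds if the symbol
# named in $1 appears as a whole word in the `nm` listing read from stdin.
has_symbol() {
    grep -qw -- "$1"
}

# Hypothetical helper: prints "present" or "missing" for library $1, symbol $2.
# An unreadable or nonexistent library is reported as "missing".
report_symbol() {
    if nm -D --defined-only "$1" 2>/dev/null | has_symbol "$2"; then
        echo "present"
    else
        echo "missing"
    fi
}
```

For example, `report_symbol /pgbin/mbi1d/14.x/lib/postgresql/datasketches.so pg_theta_intersection_get_result` reports whether that file exports the function, and `ls -l` on the same path shows whether its timestamp matches the new build.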

On Wed, May 31, 2023 at 9:28 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Yes Alexander, we did run make first.
>
> And it did create datasketches.so in the same folder.
>
>
>
> Then we ran make install and, after that, the alter command in psql.
>
>
>
> But the same error persists.
>
>
>
> mbi1d01=# alter extension datasketches update to '1.6.0';
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Wednesday, 31 May 2023 at 9:37 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> In your steps above I don't see "make" before make install. The result of
> that should be datasketches.so in the current directory.
>
> And "make install" should deploy that datasketches.so to the right place.
> Check to see if that datasketches.so in the error message is the new one.
>
> And I hope that you are doing this on a test database.
>
>
>
>
>
> On Wed, May 31, 2023 at 8:38 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Yes Alexander, thanks a lot for responding!
>
>
>
> We did run
>
>    - “make clean”
>    - “make install”
>    - “alter extension datasketches update to '1.6.0'”
>
>
>
> But we are facing same error as below:
>
>
>
> mbi1d01=# alter extension datasketches update to '1.6.0';
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
> Here are the version information:
>
> *Postgres:* PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
> 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
>
> *Datasketches*: V1.3 , want to upgrade to V1.6
>
> *Boost*: 1.75.0
>
>
>
> Please guide us on how we can get rid of this error and install the plugin
> successfully.
>
>
>
> *Note*: We are using data sketches HLL algorithm, for heavily loaded
> Postgres DBs, cannot drop the extension and reinstall because it might drop
> the column itself.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 30 May 2023 at 10:43 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> did you install the new datasketches.so (sudo make install) after you
> built it?
>
>
>
> On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com>
> wrote:
>
> alter extension datasketches update to '1.6.0';
>
>
>
> On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> Please help us to fix the current issue we are facing to migrate from 1.3
> to 1.6 version.
>
>
>
> Error:
>
> mbi1d01=# create extension datasketches;
> ERROR:  extension "datasketches" already exists
>
> mbi1d01=# drop extension datasketches;
> ERROR:  cannot drop extension datasketches because other objects depend on
> it
> DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on
> type mbi.hll_sketch
>
>
>
> Question:
>
> We are already using v1.3. When we run create extension, it throws an error
> that the extension already exists. We cannot drop the extension because we
> get a warning that the hll_sketch column will be dropped from tables. Is it
> possible to upgrade from 1.3 to 1.6 directly without dropping the extension?
> If so, how?
>
> We tried the command below to update the extension, but we get the following
> error:
>
> mbi1d01=# alter extension datasketches update;
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Wednesday, 24 May 2023 at 6:45 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I managed to reproduce the problem. It seems to me that such an old GCC
> 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
>
>
>
> On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com>
> wrote:
>
> What is your PostgreSQL version?
>
>
>
> On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Here is the C++ version
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 9:39 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> What is your compiler?
>
> Try "c++ --version"
>
>
>
> On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Just to add on here the versions used.
>
>
>
>    1. datasketches-1.6.0 with datasketches-cpp included.
>    2. boost_1_80_0
>
>
>
> here is folder structure:
>
>
>
> Error after “make install”
>
>
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Not able to install the latest version of datasketches.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 8:38 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hello All,
>
>
>
> Facing this error while installing the dataSketches 1.6 version and doing
> “make install”.
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Please guide!
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Monday, 22 May 2023 at 10:12 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> This is not an error. This is a warning.
>
>
>
> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hi Alexander,
>
>
>
> We are trying to upgrade the datasketches-postgresql extension on one of
> our Linux servers.
>
>
>
> We are getting this error:
>
>
>
> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
> src/aod_sketch_c_adapter.cpp
> In file included from
> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>                  from boost/boost/math/special_functions/beta.hpp:13,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
> language standard to use Boost.Math will be C++14 starting in July 2023
> (Boost 1.82 release)" [-Wcpp]
> #    warning "The minimum language standard to use Boost.Math will be
> C++14 starting in July 2023 (Boost 1.82 release)"
>       ^
> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>                  from boost/boost/math/special_functions/gamma.hpp:18,
>                  from boost/boost/math/special_functions/beta.hpp:15,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
> boost::math::tools::integer_scalar_type<long double, true>’:
> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double; uintmax_t = long unsigned int]’
> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double]’
>
>
>
> We are using boost_1_80_0 version, please let us know if you have any clue
> what could be wrong?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 19 May 2023 at 1:54 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
> Just download, unpack and make a link so that the "make" can find it.
>
> I am afraid I don't quite understand your question. I would suggest
> following the Readme and asking specific questions about what is not clear
> or what goes wrong.
>
>
>
> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>
> Today we downloaded the package from the *PGXN
> <https://pgxn.org/dist/datasketches/>* website. Is it mandatory to install
> the Boost package too?
>
> While installing 1.3 version of Postgres dataSketches plugin earlier, we
> didn’t use Boost then.
>
>
>
> Also, are the steps below, as mentioned in the documentation, sufficient
> to install it?
>
> *Building and installing*
>
>    - make
>    - sudo make install
>
> Thanks in advance!
>
>
>
> Regards,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Thursday, 27 April 2023 at 1:25 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> The changes in question have been merged to the master branch.
>
> We have just started the release process for datasketches-cpp (version
> 4.1.0). Once this is done, we will start the release process for
> datasketches-postgresql 1.6.0. In the meantime you may want to try the
> latest code with the latest datasketches-cpp from the master branch.
>
>
>
> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>
> As noted in the linked issue, the postgresql 1.5 package is compatible
> with the cpp 3.x line, not 4.x. It should work fine with the last
> datasketches-cpp 3.x release.
>
>
>
> In the meantime, as noted, we are actively trying to work on speed
> improvements for HLL as requested at the start of this thread.
>
>
>
> Additionally, one thing that can help speed releases is to vote whenever
> there's a vote announcement -- even a non-binding vote is valuable!
>
>
>
>   jon
>
>
>
> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> We are trying to install new version of datasketches in our postgres
> instance. I have downloaded datasketches-postgresql 1.5.0
> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
> (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost
> 1.81.0. I have followed the same steps as mentioned in the readme file.
> While executing the make command, I faced an error:
>
> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
> -Idatasketches-cpp/req/include -I. -I./
> -I/pgbin/mbi1d/12.x/include/postgresql/server
> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template
> arguments (4, should be 3)
> typedef datasketches::kll_sketch<float, std::less<float>,
> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>
> ^
> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
> ‘template<class T, class C, class A> class datasketches::kll_sketch’
> class kll_sketch {
>
> Looks like there is a mismatch of arguments in
> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
> Could you please suggest a solution. Thank you!
>
> https://github.com/apache/datasketches-postgresql/issues/62
>
> *The DataSketches distinct count Postgres extension is used in our
> applications to deliver significant business value; therefore, if we cannot
> upgrade the versions, it would be a big loss for us.*
>
> *Could you please guide us what could be the best approach to overcome
> this?*
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Saturday, 15 April 2023 at 12:05 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I am not sure about the date. I think the development should take a few
> days. A formal Apache release will take substantially more time just to go
> through the required steps of voting for the core library release (not
> really necessary for the parallel execution, but necessary to bring the
> latest speed improvements into PostgreSQL extension), and then going
> through the same procedure to release the extension.
>
> Of course, you don't have to wait for the formal release to start testing.
>
> Could you clarify your issues building the latest version please? I
> believe that the datasketches-postgresql code in the master branch is
> compatible with the latest datasketches-cpp code.
>
>
>
> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Alexander,
>
>
>
> Do you have any date in mind, for releasing the same to have parallel
> execution?
>
> Also, we tried upgrading the datasketches version following the latest
> documentation, and we are getting a lot of C++ version issues.
>
> It's very tough to install the new version. Any thoughts?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 14 April 2023 at 10:58 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hi Rima,
>
> I am working on the datasketches extension to support parallel queries
> (distributed aggregation).
>
> I expect to get this done in a matter of days.
>
> Also, we have just made some improvements to HLL merge speed in the core
> library. These changes have not been released yet, but are available in the
> master branch.
>
> We have another HLL performance improvement in mind. I will work on it
> once I finish the parallel query support.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Team,
>
>
>
> Here is the snapshot of the existing application:
>
>
>
> TechStack: Postgres DB, Hive, Tableau UI
>
> Postgres Plugin: DataSketches
>
>
>
> Flow in brief:
>
>    - Hadoop Data pipeline job pushes pre-aggregated(using hive
>    datasketches algo) active card data, along with other details to Hive.
>    - Another job populates that data to Postgres DB, finally having 3
>    years data of 4 regions for multiple countries.
>    - Tableau dashboard having live connection to Postgres DB.
>    - Tableau Query calling Postgres DB, to aggregate the
>    binary/pre-aggregated data to get distinct card count (using DataSketches
>    algorithm) and fetch data based on multiple filter conditions.
>    - Usually data would be of 3yrs for the span of 2 months, means total
>    6 months of data to aggregate for a country on multiple conditions.
>
>
>
> Usually the response to this aggregation query is quite slow. We have tried
> a lot of different ways to resolve this.
>
>
>
> The datasketches part takes up most of the execution time.
>
>
>
> Thanks & Regards,
>
> Rima Bhowmick
>
> Marketing Brand Analytics
>
>
>

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Yes Alexander, we did run make first.
And it did create datasketches.so in the same folder.

Then we ran make install and, after that, the alter command in psql.

But the same error persists.

mbi1d01=# alter extension datasketches update to '1.6.0';
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
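
A quick way to rule out a stale install, offered as a sketch rather than a documented procedure, is to compare the freshly built library with the copy that PostgreSQL loads. `same_library` and `check_install` are hypothetical helpers. `pg_config --pkglibdir` is the usual way to ask PostgreSQL where extension libraries live, assuming the `pg_config` on the PATH belongs to the same installation as the running server.

```shell
# Hypothetical helper: succeeds only if the two files are byte-for-byte identical.
same_library() {
    cmp -s "$1" "$2"
}

# Hypothetical helper: compares ./datasketches.so (the build output in the
# current directory) with the copy in directory $1 and prints the result.
check_install() {
    if same_library ./datasketches.so "$1/datasketches.so"; then
        echo "installed copy matches the build"
    else
        echo "installed copy differs from the build"
    fi
}
```

Run from the extension source directory, for example as `check_install "$(pg_config --pkglibdir)"` or `check_install /pgbin/mbi1d/14.x/lib/postgresql`. A mismatch would suggest that make install deployed to a different PostgreSQL installation than the one reporting the error.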

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Wednesday, 31 May 2023 at 9:37 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

In your steps above I don't see "make" before make install. The result of that should be datasketches.so in the current directory.
And "make install" should deploy that datasketches.so to the right place. Check to see if that datasketches.so in the error message is the new one.
And I hope that you are doing this on a test database.


On Wed, May 31, 2023 at 8:38 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Yes Alexander, thanks a lot for responding!

We did run

  *   “make clean”
  *   “make install”
  *   “alter extension datasketches update to '1.6.0'”

But we are facing same error as below:

mbi1d01=# alter extension datasketches update to '1.6.0';
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"

Here are the version information:
Postgres: PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
Datasketches: V1.3 , want to upgrade to V1.6
Boost: 1.75.0

Please guide us on how we can get rid of this error and install the plugin successfully.

Note: We are using data sketches HLL algorithm, for heavily loaded Postgres DBs, cannot drop the extension and reinstall because it might drop the column itself.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 30 May 2023 at 10:43 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

did you install the new datasketches.so (sudo make install) after you built it?

On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com> wrote:

alter extension datasketches update to '1.6.0';

On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello All,
Please help us to fix the current issue we are facing to migrate from 1.3 to 1.6 version.

Error:
mbi1d01=# create extension datasketches;
ERROR:  extension "datasketches" already exists
mbi1d01=# drop extension datasketches;
ERROR:  cannot drop extension datasketches because other objects depend on it
DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on type mbi.hll_sketch

Question:
We are already using v1.3. When we run create extension, it throws an error that the extension already exists. We cannot drop the extension because we get a warning that the hll_sketch column will be dropped from tables. Is it possible to upgrade from 1.3 to 1.6 directly without dropping the extension? If so, how?
We tried the command below to update the extension, but we get the following error:
mbi1d01=# alter extension datasketches update;
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
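
Before retrying the update, it may help to see which version PostgreSQL believes is installed and which update paths it can find. The sketch below only prints SQL. `extension_status_sql` is a hypothetical helper, while `pg_extension`, `pg_available_extension_versions` and `pg_extension_update_paths()` are standard PostgreSQL catalog objects.

```shell
# Hypothetical helper: print SQL that shows the installed version of extension
# $1, the versions available on disk, and the update paths PostgreSQL knows.
extension_status_sql() {
    cat <<EOF
SELECT extversion FROM pg_extension WHERE extname = '$1';
SELECT version, installed FROM pg_available_extension_versions WHERE name = '$1';
SELECT source, target, path FROM pg_extension_update_paths('$1') WHERE path IS NOT NULL;
EOF
}
```

The output can be piped into psql, for example `extension_status_sql datasketches | psql -d mbi1d01`. If no path from the installed version to 1.6.0 is listed, the update scripts on disk are probably not the ones shipped with the new release.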


Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Wednesday, 24 May 2023 at 6:45 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I managed to reproduce the problem. It seems to me that such an old GCC 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
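
To confirm which Boost release the build actually picks up, one option is to read `BOOST_LIB_VERSION` from `boost/version.hpp`, the macro in which Boost records its release (for example `1_75`). `boost_lib_version` is a hypothetical helper, and the layout it expects is an assumption based on the `-Iboost` flag and the `boost/boost/math/...` paths in the compiler output quoted in this thread.

```shell
# Hypothetical helper: print the value of BOOST_LIB_VERSION found in
# $1/boost/version.hpp, where $1 is the Boost root directory.
boost_lib_version() {
    sed -n 's/^#define BOOST_LIB_VERSION "\([^"]*\)".*/\1/p' "$1/boost/version.hpp"
}
```

From the extension source directory this would be `boost_lib_version boost`, checked alongside the output of `c++ --version`.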

On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com> wrote:
What is your PostgreSQL version?

On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Here is the C++ version

[inline image not preserved in this plain-text archive]

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 9:39 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

What is your compiler?
Try "c++ --version"

On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Just to add on here the versions used.


  1.  datasketches-1.6.0 with datasketches-cpp included.
  2.  boost_1_80_0

here is folder structure:
[inline image not preserved in this plain-text archive]

Error after “make install”

src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Not able to install the latest version of datasketches.

Thanks,
Rima Bhowmick.

From: "Bhowmick, Rima" <rb...@visa.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 8:38 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hello All,

Facing this error while installing the dataSketches 1.6 version and doing “make install”.
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Monday, 22 May 2023 at 10:12 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

This is not an error. This is a warning.

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using boost_1_80_0 version, please let us know if you have any clue what could be wrong?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 19 May 2023 at 1:54 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Yes, version 1.6.0 does depend on Boost. There is no need to install it. Just download, unpack and make a link so that the "make" can find it.
I am afraid I don't quite understand your question. I would suggest following the Readme and asking specific questions about what is not clear or what goes wrong.
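
As an illustration of the download, unpack and link step: the compiler output quoted elsewhere in this thread passes `-Iboost` and reports headers under `boost/boost/math/...`, which suggests that the link is named `boost` and sits in the extension source directory. That name is inferred from those logs, so the Readme remains the authority. `link_boost` is a hypothetical helper.

```shell
# Hypothetical helper: in extension source directory $1, create (or replace)
# a symlink named "boost" that points at the unpacked Boost directory $2.
link_boost() {
    ln -sfn "$2" "$1/boost"
}
```

For example, `link_boost . /path/to/boost_1_75_0` (a placeholder path), followed by `make` and then `sudo make install`.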

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Thanks for making the dataSketches1.6 version live, it will help us a lot.
Today we downloaded the package from the PGXN <https://pgxn.org/dist/datasketches/> website. Is it mandatory to install the Boost package too?
While installing 1.3 version of Postgres dataSketches plugin earlier, we didn’t use Boost then.

Also, are the steps below, as mentioned in the documentation, sufficient to install it?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install new version of datasketches in our postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost 1.81.0. I have followed the same steps as mentioned in the readme file. While executing the make command, I faced an error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

Looks like there is a mismatch of arguments in kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution. Thank you!

https://github.com/apache/datasketches-postgresql/issues/62
The DataSketches distinct count Postgres extension is used in our applications to deliver significant business value; therefore, if we cannot upgrade the versions, it would be a big loss for us.
Could you please guide us what could be the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for the release that adds parallel execution?
We also tried upgrading datasketches following the latest documentation, and we are getting a lot of C++ version issues.
It's very tough to install the new version. Any thoughts?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also, we have just made some improvements to HLL merge speed in the core library. These changes have not been released yet, but they are available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   A Hadoop data pipeline job pushes pre-aggregated active card data (built with the Hive DataSketches functions), along with other details, to Hive.
  *   Another job loads that data into the Postgres DB, which ends up holding 3 years of data for 4 regions and multiple countries.
  *   A Tableau dashboard has a live connection to the Postgres DB.
  *   The Tableau query calls the Postgres DB to aggregate the binary pre-aggregated data into a distinct card count (using the DataSketches algorithm) and fetches data based on multiple filter conditions.
  *   A typical query covers a 2-month span in each of 3 years, i.e. 6 months of data in total, aggregated for a country under multiple conditions.

The response to this aggregation query is usually quite slow. We have tried a lot of different ways to resolve this.

The datasketches part accounts for most of the execution time.
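
For reference, the kind of aggregation described above might be profiled from the shell as follows. This is a rough sketch, not the actual Tableau query: hll_sketch_union and hll_sketch_get_estimate are the extension's HLL union aggregate and estimate function, the table and column names are taken from an error message elsewhere in this thread, and the helper name explain_distinct_count and the filter column cntry_cd are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the execution plan and timings for a
# distinct-count query over stored HLL sketches.
#   $1 = database name, $2 = value for the (invented) cntry_cd filter column.
# Note: $2 is interpolated without escaping; this is for manual use only.
explain_distinct_count() {
    psql -d "$1" -c "EXPLAIN (ANALYZE, BUFFERS)
        SELECT hll_sketch_get_estimate(hll_sketch_union(actv_crd))
        FROM mbi.tmbaf_skyline_agg_bkp
        WHERE cntry_cd = '$2';"
}

# Example (commented out so that sourcing this file has no side effects):
#   explain_distinct_count mbi1d01 IN
```

The plan output shows how much time is spent in the aggregate node and whether PostgreSQL chose a parallel plan. If the extension's functions live in a non-default schema, the search_path may need to include it.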

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
In your steps above I don't see "make" before make install. The result of
that should be datasketches.so in the current directory.
And "make install" should deploy that datasketches.so to the right place.
Check to see if that datasketches.so in the error message is the new one.
And I hope that you are doing this on a test database.
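
The check described above can be sketched in shell. This is a minimal sketch, assuming GNU binutils nm is available and that the pg_config on the PATH belongs to the target PostgreSQL installation; the function name so_exports_symbol is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the shared library at $1 exports symbol $2.
so_exports_symbol() {
    # nm -D lists the dynamic symbol table; --defined-only skips imported symbols.
    # The symbol name is the last field of each output line.
    nm -D --defined-only "$1" | awk '{ print $NF }' | grep -qx "$2"
}

# Typical sequence, run from the extension source directory
# (commented out so that sourcing this file has no side effects):
#   make                 # should produce datasketches.so in the current directory
#   sudo make install    # should copy it into "$(pg_config --pkglibdir)"
#   so_exports_symbol "$(pg_config --pkglibdir)/datasketches.so" \
#       pg_theta_intersection_get_result && echo "new library is installed"
```

If the freshly built datasketches.so passes the check but the installed copy does not, the install step most likely did not replace the old library.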


On Wed, May 31, 2023 at 8:38 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Yes Alexander, thanks a lot for responding!
>
>
>
> We did run
>
>    - “make clean”
>    - “make install”
>    - “alter extension datasketches update to '1.6.0'”
>
>
>
> But we are facing the same error as below:
>
>
>
> mbi1d01=# alter extension datasketches update to '1.6.0';
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
> Here is the version information:
>
> *Postgres:* PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC)
> 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
>
> *Datasketches*: V1.3 , want to upgrade to V1.6
>
> *Boost*: 1.75.0
>
>
>
> Please advise how we can get rid of this error and install the plugin
> successfully.
>
>
>
> *Note*: We use the DataSketches HLL algorithm on heavily loaded Postgres
> DBs and cannot drop and reinstall the extension, because that might drop
> the column itself.
>
>
>
> Thanks,
>
> Rima Bhowmick.

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Yes Alexander, thanks a lot for responding!

We did run

  *   “make clean”
  *   “make install”
  *   “alter extension datasketches update to '1.6.0'”

But we are facing the same error as below:

mbi1d01=# alter extension datasketches update to '1.6.0';
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"

Here is the version information:
Postgres: PostgreSQL 14.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
Datasketches: V1.3, want to upgrade to V1.6
Boost: 1.75.0

Please advise how we can get rid of this error and install the plugin successfully.

Note: We use the DataSketches HLL algorithm on heavily loaded Postgres DBs and cannot drop and reinstall the extension, because that might drop the column itself.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 30 May 2023 at 10:43 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Did you install the new datasketches.so (sudo make install) after you built it?

On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com> wrote:

alter extension datasketches update to '1.6.0';
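
Before and after running the update, the installed and available versions can be checked from the shell. A minimal sketch, assuming psql can connect to the database without extra options; the function names are made up for illustration, while pg_extension and pg_available_extension_versions are standard PostgreSQL catalogs.

```shell
#!/bin/sh
# Hypothetical helpers; $1 is the database name (mbi1d01 in this thread).

# Version of the extension currently registered in the database.
installed_version() {
    psql -d "$1" -At -c "SELECT extversion FROM pg_extension WHERE extname = 'datasketches';"
}

# Versions for which install/update scripts are present on disk.
available_versions() {
    psql -d "$1" -At -c "SELECT version FROM pg_available_extension_versions WHERE name = 'datasketches' ORDER BY version;"
}

# Update in place, without dropping the extension; $2 is the target version.
update_extension() {
    psql -d "$1" -c "ALTER EXTENSION datasketches UPDATE TO '$2';"
}

# Example (commented out so that sourcing this file has no side effects):
#   installed_version mbi1d01
#   available_versions mbi1d01
#   update_extension mbi1d01 1.6.0
```

If the target version is listed as available but the update still fails with a missing function, the shared library on disk is probably an older build than the SQL scripts.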

On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello All,
Please help us fix the issue we are facing while migrating from version 1.3 to 1.6.

Error:
mbi1d01=# create extension datasketches;
ERROR:  extension "datasketches" already exists
mbi1d01=# drop extension datasketches;
ERROR:  cannot drop extension datasketches because other objects depend on it
DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on type mbi.hll_sketch

Question:
We are already using v1.3. When we run create extension, it throws an error that the extension already exists. We cannot drop the extension, as we get a warning that the hll_sketch column will be dropped from tables. Is it possible to upgrade from 1.3 to 1.6 directly without dropping the extension? If yes, how?
We tried the command below to update the extension, but we get the following error:
mbi1d01=# alter extension datasketches update;
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"


Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Wednesday, 24 May 2023 at 6:45 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I managed to reproduce the problem. It seems to me that such an old GCC 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.

On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com> wrote:
What is your PostgreSQL version?

On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Here is the C++ version

[cid:image001.png@01D99403.ED8C06D0]

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 9:39 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

What is your compiler?
Try "c++ --version"

On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Just to add, here are the versions used.


  1.  datasketches-1.6.0 with datasketches-cpp included.
  2.  boost_1_80_0

Here is the folder structure:
[cid:image002.png@01D99403.ED8C06D0]

Error after “make install”

src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

We are not able to install the latest version of datasketches.

Thanks,
Rima Bhowmick.

From: "Bhowmick, Rima" <rb...@visa.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 8:38 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hello All,

We are facing this error while installing dataSketches 1.6 and running “make install”.
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Monday, 22 May 2023 at 10:12 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

This is not an error. This is a warning.

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using Boost version 1_80_0. Please let us know if you have any clue what could be wrong.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 19 May 2023 at 1:54 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Yes, version 1.6.0 does depend on Boost. There is no need to install it. Just download, unpack and make a link so that the "make" can find it.
I am afraid I don't quite understand your question. I would suggest following the Readme and asking specific questions about what is not clear or what goes wrong.
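
The "download, unpack and make a link" step might look like this in shell. This is a sketch under assumptions: the compile commands quoted in this thread pass -Iboost and include headers as boost/boost/math/..., which suggests a link named boost in the extension source directory pointing at the unpacked Boost tree. The helper name link_boost and the example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: create (or replace) the "boost" link inside the
# extension source directory.
#   $1 = extension source directory
#   $2 = unpacked Boost tree, e.g. /opt/src/boost_1_75_0 (use an absolute path)
link_boost() {
    # The unpacked tree must contain the boost/ header directory.
    [ -d "$2/boost" ] || { echo "no boost/ headers under $2" >&2; return 1; }
    # -s symbolic, -f replace an existing link, -n do not follow an existing link.
    ln -sfn "$2" "$1/boost"
}

# Example (commented out so that sourcing this file has no side effects):
#   c++ --version      # check the compiler first; an old GCC may need an older Boost
#   link_boost "$PWD" /opt/src/boost_1_75_0
#   make
```

Because Boost is used header-only here, nothing needs to be compiled or installed system-wide; the link only has to resolve at build time.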

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Thanks for making dataSketches version 1.6 live; it will help us a lot.
Today we downloaded the package from the PGXN website (https://pgxn.org/dist/datasketches/). Is it mandatory to install the Boost package too?
When we installed version 1.3 of the Postgres dataSketches plugin earlier, we did not use Boost.

Also, are the steps below sufficient for installation, as mentioned in the documentation?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install a new version of datasketches in our Postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip) and datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from the Apache website, along with Boost 1.81.0. I followed the steps in the readme file. While executing the make command, I got this error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

It looks like there is a mismatch in the number of template arguments between kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution? Thank you!
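
For readers hitting the same message: the compiler is saying that the adapter passes four template arguments to a class template that is declared with three. Below is a minimal, self-contained sketch of that kind of mismatch. The names toy_sketch and toy_serde are hypothetical stand-ins invented for illustration; they are not the datasketches-cpp classes.

```cpp
#include <functional>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a library class template declared with three
// parameters (value type, comparator, allocator), i.e. the same shape as
// 'template<class T, class C, class A>' in the compiler message.
template <class T, class C, class A>
class toy_sketch {
public:
  using value_type = T;
  using comparator = C;
  using allocator_type = A;
};

// Hypothetical stand-in for a serializer type that older calling code
// still passes as an extra template argument.
template <class T>
struct toy_serde {};

// Compiles: three arguments for a three-parameter template.
typedef toy_sketch<float, std::less<float>, std::allocator<float>> toy_float_sketch;

// Does not compile if uncommented: four arguments for three parameters.
// GCC reports "wrong number of template arguments (4, should be 3)".
// typedef toy_sketch<float, std::less<float>, toy_serde<float>,
//                    std::allocator<float>> toy_float_sketch_old;

// Compile-time check that the three-argument alias is well formed.
static_assert(std::is_same<toy_float_sketch::value_type, float>::value,
              "value_type should be float");
```

A mismatch like this usually means the calling code and the header come from different release lines, so the fix tends to be pairing matching versions rather than editing either file.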

https://github.com/apache/datasketches-postgresql/issues/62
The DataSketches distinct-count Postgres extension is used in our applications and delivers significant business value; therefore, if we cannot upgrade the versions, it would be a big loss for us.
Could you please advise on the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for the release with parallel execution?
Also, we tried upgrading datasketches following the latest documentation, and we are getting a lot of C++ version issues.
It's very tough to install the new version. Any thoughts?
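
When a build fails with "C++ version issues", one quick sanity check is to confirm which language standard the compiler is actually targeting. The sketch below maps the standard __cplusplus macro to a standard year. The helper names standard_year and meets_minimum are made up for this illustration and are not part of any DataSketches or Boost API.

```cpp
// Maps a value of the standard __cplusplus macro to the two-digit year of
// the language standard it denotes (11, 14, 17, 20), or 0 for anything
// older than C++11. Written as a single return so it is valid C++11 constexpr.
constexpr int standard_year(long cplusplus_value) {
  return cplusplus_value >= 202002L ? 20
       : cplusplus_value >= 201703L ? 17
       : cplusplus_value >= 201402L ? 14
       : cplusplus_value >= 201103L ? 11
       : 0;
}

// True if the given __cplusplus value is at least the required standard year.
constexpr bool meets_minimum(long cplusplus_value, int required_year) {
  return standard_year(cplusplus_value) >= required_year;
}

// The standard this translation unit is being compiled as.
constexpr int current_standard = standard_year(__cplusplus);
```

Printing current_standard from a small main(), compiled with the same -std flag that appears in the g++ command line quoted in this thread (-std=c++11), shows what the library headers will see.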

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also, we have just made some improvements to HLL merge speed in the core library. These changes have not been released yet, but they are available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   A Hadoop data pipeline job pushes pre-aggregated active card data (using the Hive datasketches algorithm), along with other details, to Hive.
  *   Another job populates that data into the Postgres DB, which ends up holding 3 years of data for 4 regions across multiple countries.
  *   The Tableau dashboard has a live connection to the Postgres DB.
  *   The Tableau query calls the Postgres DB to aggregate the binary/pre-aggregated data into a distinct card count (using the DataSketches algorithm) and fetches data based on multiple filter conditions.
  *   Usually the data covers a 2-month span in each of 3 years, i.e. 6 months of data in total to aggregate for a country on multiple conditions.

Usually this aggregation query response is quite slow. We have tried a lot of different ways to resolve this.

The datasketches part takes up most of the execution time.

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
Did you install the new datasketches.so (sudo make install) after you built
it?

On Tue, May 30, 2023 at 9:56 AM Alexander Saydakov <sa...@yahooinc.com>
wrote:

> alter extension datasketches update to '1.6.0';
>
> On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
>> Hello All,
>>
>> Please help us to fix the current issue we are facing to migrate from 1.3
>> to 1.6 version.
>>
>>
>>
>> Error:
>>
>> mbi1d01=# create extension datasketches;
>> ERROR:  extension "datasketches" already exists
>>
>> mbi1d01=# drop extension datasketches;
>> ERROR:  cannot drop extension datasketches because other objects depend
>> on it
>> DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on
>> type mbi.hll_sketch
>>
>>
>>
>> Question:
>>
>>  We are already using v1.3. When we run create extension, it throws an error
>> that the extension already exists. We cannot drop the extension, as we get a
>> warning that the hll_sketch column will be dropped from tables. Is it possible
>> to upgrade from 1.3 to 1.6 directly without dropping the extension? If yes, how?
>>
>> We tried the command below to update the extension, but we get the following
>> error:
>>
>> mbi1d01=# alter extension datasketches update;
>> ERROR:  could not find function "pg_theta_intersection_get_result" in
>> file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Wednesday, 24 May 2023 at 6:45 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> I managed to reproduce the problem. It seems to me that such an old GCC
>> 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
>>
>>
>>
>> On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com>
>> wrote:
>>
>> What is your PostgreSQL version?
>>
>>
>>
>> On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Here is the C++ version
>>
>>
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Tuesday, 23 May 2023 at 9:39 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> What is your compiler?
>>
>> Try "c++ --version"
>>
>>
>>
>> On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Just to add on here the versions used.
>>
>>
>>
>>    1. datasketches-1.6.0 with datasketches-cpp included.
>>    2. boost_1_80_0
>>
>>
>>
>> here is folder structure:
>>
>>
>>
>> Error after “make install”
>>
>>
>>
>> src/aod_sketch_c_adapter.cpp:310:56:   required from here
>> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
>> not a class, struct, or union type
>>        using value_type = typename T::value_type;
>>                                                 ^
>> make: *** [src/aod_sketch_c_adapter.o] Error 1
>>
>>
>>
>> Not able to install the latest version of datasketches.
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Tuesday, 23 May 2023 at 8:38 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> Hello All,
>>
>>
>>
>> Facing this error while installing the dataSketches 1.6 version and doing
>> “make install”.
>>
>> src/aod_sketch_c_adapter.cpp:310:56:   required from here
>> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
>> not a class, struct, or union type
>>        using value_type = typename T::value_type;
>>                                                 ^
>> make: *** [src/aod_sketch_c_adapter.o] Error 1
>>
>>
>>
>> Please guide!
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Monday, 22 May 2023 at 10:12 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> This is not an error. This is a warning.
>>
>>
>>
>> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hi Alexander,
>>
>>
>>
>> We are trying to upgrade the datasketches-postgresql extension on one of
>> our Linux servers.
>>
>>
>>
>> We are getting this error:
>>
>>
>>
>> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
>> src/aod_sketch_c_adapter.cpp
>> In file included from
>> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>>                  from boost/boost/math/special_functions/beta.hpp:13,
>>                  from boost/boost/math/distributions/students_t.hpp:16,
>>                  from src/aod_sketch_c_adapter.cpp:34:
>> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
>> language standard to use Boost.Math will be C++14 starting in July 2023
>> (Boost 1.82 release)" [-Wcpp]
>> #    warning "The minimum language standard to use Boost.Math will be
>> C++14 starting in July 2023 (Boost 1.82 release)"
>>       ^
>> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>>                  from boost/boost/math/special_functions/gamma.hpp:18,
>>                  from boost/boost/math/special_functions/beta.hpp:15,
>>                  from boost/boost/math/distributions/students_t.hpp:16,
>>                  from src/aod_sketch_c_adapter.cpp:34:
>> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
>> boost::math::tools::integer_scalar_type<long double, true>’:
>> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
>> boost::math::tools::detail::fraction_traits<Gen>::result_type
>> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
>> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
>> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
>> long double; uintmax_t = long unsigned int]’
>> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
>> boost::math::tools::detail::fraction_traits<Gen>::result_type
>> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
>> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
>> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
>> long double]’
>>
>>
>>
>> We are using boost_1_80_0 version, please let us know if you have any
>> clue what could be wrong?
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Friday, 19 May 2023 at 1:54 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
>> Just download, unpack and make a link so that the "make" can find it.
>>
>> I am afraid I don't quite understand your question. I would suggest
>> following the Readme and asking specific questions about what is not clear
>> or what goes wrong.
>>
>>
>>
>> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>>
>> Today we downloaded the package from the PGXN
>> <https://pgxn.org/dist/datasketches/>
>> website. Is it mandatory to install the Boost package too?
>>
>> While installing 1.3 version of Postgres dataSketches plugin earlier, we
>> didn’t use Boost then.
>>
>>
>>
>> Also, are the steps below sufficient for installation, as mentioned in the
>> documentation?
>>
>> *Building and installing*
>>
>>    - make
>>    - sudo make install
>>
>> Thanks in advance!
>>
>>
>>
>> Regards,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Thursday, 27 April 2023 at 1:25 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> The changes in question have been merged to the master branch.
>>
>> We have just started the release process for datasketches-cpp (version
>> 4.1.0). Once this is done, we will start the release process for
>> datasketches-postgresql 1.6.0. In the meantime you may want to try the
>> latest code with the latest datasketches-cpp from the master branch.
>>
>>
>>
>> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>>
>> As noted in the linked issue, the postgresql 1.5 package is compatible
>> with the cpp 3.x line, not 4.x. It should work fine with the last
>> datasketches-cpp 3.x release.
>>
>>
>>
>> In the meantime, as noted, we are actively trying to work on speed
>> improvements for HLL as requested at the start of this thread.
>>
>>
>>
>> Additionally, one thing that can help speed releases is to vote whenever
>> there's a vote announcement -- even a non-binding vote is valuable!
>>
>>
>>
>>   jon
>>
>>
>>
>> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello All,
>>
>> We are trying to install new version of datasketches in our postgres
>> instance. I have downloaded datasketches-postgresql 1.5.0
>> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
>> (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost
>> 1.81.0. I have followed the same steps as mentioned in the readme file.
>> While executing the make command, I faced an error:
>>
>> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
>> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
>> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
>> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
>> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
>> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
>> -Idatasketches-cpp/req/include -I. -I./
>> -I/pgbin/mbi1d/12.x/include/postgresql/server
>> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
>> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
>> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
>> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of
>> template arguments (4, should be 3)
>> typedef datasketches::kll_sketch<float, std::less<float>,
>> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>>
>> ^
>> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
>> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
>> ‘template<class T, class C, class A> class datasketches::kll_sketch’
>> class kll_sketch {
>>
>> Looks like there is a mismatch of arguments in
>> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
>> Could you please suggest a solution. Thank you!
>>
>> https://github.com/apache/datasketches-postgresql/issues/62
>>
>> *The DataSketches distinct-count Postgres extension is used in our
>> applications and delivers significant business value; therefore, if we cannot
>> upgrade the versions, it would be a big loss for us.*
>>
>> *Could you please guide us what could be the best approach to overcome
>> this?*
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Saturday, 15 April 2023 at 12:05 AM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> I am not sure about the date. I think the development should take a few
>> days. A formal Apache release will take substantially more time just to go
>> through the required steps of voting for the core library release (not
>> really necessary for the parallel execution, but necessary to bring the
>> latest speed improvements into PostgreSQL extension), and then going
>> through the same procedure to release the extension.
>>
>> Of course, you don't have to wait for the formal release to start testing.
>>
>> Could you clarify your issues building the latest version please? I
>> believe that the datasketches-postgresql code in the master branch is
>> compatible with the latest datasketches-cpp code.
>>
>>
>>
>> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello Alexander,
>>
>>
>>
>> Do you have any date in mind, for releasing the same to have parallel
>> execution?
>>
>> Also, we tried upgrading datasketches following the latest documentation, and
>> we are getting a lot of C++ version issues.
>>
>> It's very tough to install the new version. Any thoughts?
>>
>>
>>
>> Thanks,
>>
>> Rima Bhowmick.
>>
>>
>>
>> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
>> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Date: *Friday, 14 April 2023 at 10:58 PM
>> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
>> *Subject: *Re: [E] Postgres HLL is very slow
>>
>>
>>
>> Hi Rima,
>>
>> I am working on the datasketches extension to support parallel queries
>> (distributed aggregation).
>>
>> I expect to get this done in a matter of days.
>>
>> Also, we have just made some improvements to HLL merge speed in the core
>> library. These changes have not been released yet, but they are available in
>> the master branch.
>>
>> We have another HLL performance improvement in mind. I will work on it
>> once I finish the parallel query support.
>>
>>
>>
>>
>>
>> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
>> wrote:
>>
>> Hello Team,
>>
>>
>>
>> Here is the snapshot of the existing application:
>>
>>
>>
>> TechStack: Postgres DB, Hive, Tableau UI
>>
>> Postgres Plugin: DataSketches
>>
>>
>>
>> Flow in brief:
>>
>>    - Hadoop Data pipeline job pushes pre-aggregated(using hive
>>    datasketches algo) active card data, along with other details to Hive.
>>    - Another job populates that data to Postgres DB, finally having 3
>>    years data of 4 regions for multiple countries.
>>    - Tableau dashboard having live connection to Postgres DB.
>>    - Tableau Query calling Postgres DB, to aggregate the
>>    binary/pre-aggregated data to get distinct card count (using DataSketches
>>    algorithm) and fetch data based on multiple filter conditions.
>>    - Usually data would be of 3yrs for the span of 2 months, means total
>>    6 months of data to aggregate for a country on multiple conditions.
>>
>>
>>
>> Usually this aggregation query response is quite slow. We have tried lot
>> of different ways to resolve this,
>>
>>
>>
>> Mainly datasketches part is making most of the time in execution.
>>
>>
>>
>> Thanks & Regards,
>>
>> Rima Bhowmick
>>
>> Marketing Brand Analytics
>>
>>
>>

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
alter extension datasketches update to '1.6.0';

On Mon, May 29, 2023 at 8:58 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Hello All,
>
> Please help us to fix the current issue we are facing to migrate from 1.3
> to 1.6 version.
>
>
>
> Error:
>
> mbi1d01=# create extension datasketches;
> ERROR:  extension "datasketches" already exists
>
> mbi1d01=# drop extension datasketches;
> ERROR:  cannot drop extension datasketches because other objects depend on
> it
> DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on
> type mbi.hll_sketch
>
>
>
> Question:
>
>  We are already using v1.3. When we run create extension, it throws an error
> that the extension already exists. We cannot drop the extension, as we get a
> warning that the hll_sketch column will be dropped from tables. Is it possible
> to upgrade from 1.3 to 1.6 directly without dropping the extension? If yes, how?
>
> We tried the command below to update the extension, but we get the following
> error:
>
> mbi1d01=# alter extension datasketches update;
> ERROR:  could not find function "pg_theta_intersection_get_result" in file
> "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Wednesday, 24 May 2023 at 6:45 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I managed to reproduce the problem. It seems to me that such an old GCC
> 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
>
>
>
> On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com>
> wrote:
>
> What is your PostgreSQL version?
>
>
>
> On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Here is the C++ version
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 9:39 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> What is your compiler?
>
> Try "c++ --version"
>
>
>
> On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Just to add on here the versions used.
>
>
>
>    1. datasketches-1.6.0 with datasketches-cpp included.
>    2. boost_1_80_0
>
>
>
> here is folder structure:
>
>
>
> Error after “make install”
>
>
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Not able to install the latest version of datasketches.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 8:38 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hello All,
>
>
>
> Facing this error while installing the dataSketches 1.6 version and doing
> “make install”.
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Please guide!
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Monday, 22 May 2023 at 10:12 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> This is not an error. This is a warning.
>
>
>
> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hi Alexander,
>
>
>
> We are trying to upgrade the datasketches-postgresql extension on one of
> our Linux servers.
>
>
>
> We are getting this error:
>
>
>
> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
> src/aod_sketch_c_adapter.cpp
> In file included from
> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>                  from boost/boost/math/special_functions/beta.hpp:13,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
> language standard to use Boost.Math will be C++14 starting in July 2023
> (Boost 1.82 release)" [-Wcpp]
> #    warning "The minimum language standard to use Boost.Math will be
> C++14 starting in July 2023 (Boost 1.82 release)"
>       ^
> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>                  from boost/boost/math/special_functions/gamma.hpp:18,
>                  from boost/boost/math/special_functions/beta.hpp:15,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
> boost::math::tools::integer_scalar_type<long double, true>’:
> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double; uintmax_t = long unsigned int]’
> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double]’
>
>
>
> We are using boost_1_80_0 version, please let us know if you have any clue
> what could be wrong?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 19 May 2023 at 1:54 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
> Just download, unpack and make a link so that the "make" can find it.
>
> I am afraid I don't quite understand your question. I would suggest
> following the Readme and asking specific questions about what is not clear
> or what goes wrong.
>
>
>
> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>
> Today we downloaded the package from the PGXN
> <https://pgxn.org/dist/datasketches/>
> website. Is it mandatory to install the Boost package too?
>
> While installing 1.3 version of Postgres dataSketches plugin earlier, we
> didn’t use Boost then.
>
>
>
> Also, are the steps below sufficient for installation, as mentioned in the
> documentation?
>
> *Building and installing*
>
>    - make
>    - sudo make install
>
> Thanks in advance!
>
>
>
> Regards,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Thursday, 27 April 2023 at 1:25 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> The changes in question have been merged to the master branch.
>
> We have just started the release process for datasketches-cpp (version
> 4.1.0). Once this is done, we will start the release process for
> datasketches-postgresql 1.6.0. In the meantime you may want to try the
> latest code with the latest datasketches-cpp from the master branch.
>
>
>
> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>
> As noted in the linked issue, the postgresql 1.5 package is compatible
> with the cpp 3.x line, not 4.x. It should work fine with the last
> datasketches-cpp 3.x release.
>
>
>
> In the meantime, as noted, we are actively trying to work on speed
> improvements for HLL as requested at the start of this thread.
>
>
>
> Additionally, one thing that can help speed releases is to vote whenever
> there's a vote announcement -- even a non-binding vote is valuable!
>
>
>
>   jon
>
>
>
> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> We are trying to install new version of datasketches in our postgres
> instance. I have downloaded datasketches-postgresql 1.5.0
> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
> (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost
> 1.81.0. I have followed the same steps as mentioned in the readme file.
> While executing the make command, I faced an error:
>
> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
> -Idatasketches-cpp/req/include -I. -I./
> -I/pgbin/mbi1d/12.x/include/postgresql/server
> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template
> arguments (4, should be 3)
> typedef datasketches::kll_sketch<float, std::less<float>,
> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>
> ^
> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
> ‘template<class T, class C, class A> class datasketches::kll_sketch’
> class kll_sketch {
>
> Looks like there is a mismatch of arguments in
> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
> Could you please suggest a solution. Thank you!
>
> https://github.com/apache/datasketches-postgresql/issues/62
>
> *The DataSketches distinct-count Postgres extension is used in our
> applications and delivers significant business value; therefore, if we cannot
> upgrade the versions, it would be a big loss for us.*
>
> *Could you please guide us what could be the best approach to overcome
> this?*
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Saturday, 15 April 2023 at 12:05 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I am not sure about the date. I think the development should take a few
> days. A formal Apache release will take substantially more time just to go
> through the required steps of voting for the core library release (not
> really necessary for the parallel execution, but necessary to bring the
> latest speed improvements into PostgreSQL extension), and then going
> through the same procedure to release the extension.
>
> Of course, you don't have to wait for the formal release to start testing.
>
> Could you clarify your issues building the latest version please? I
> believe that the datasketches-postgresql code in the master branch is
> compatible with the latest datasketches-cpp code.
>
>
>
> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Alexander,
>
>
>
> Do you have a date in mind for releasing the parallel execution support?
>
> Also, we tried upgrading datasketches following the latest documentation,
> but we are getting a lot of C++ version issues.
>
> It's very tough to install the new version. Any thoughts?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 14 April 2023 at 10:58 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hi Rima,
>
> I am working on the datasketches extension to support parallel queries
> (distributed aggregation).
>
> I expect to get this done in a matter of days.
>
> Also, we have just made some improvements to HLL merge speed in the core
> library. These changes have not been released yet, but are available in
> the master branch.
>
> We have another HLL performance improvement in mind. I will work on it
> once I finish the parallel query support.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Team,
>
>
>
> Here is the snapshot of the existing application:
>
>
>
> TechStack: Postgres DB, Hive, Tableau UI
>
> Postgres Plugin: DataSketches
>
>
>
> Flow in brief:
>
>    - A Hadoop data pipeline job pushes pre-aggregated (using the Hive
>    datasketches algorithm) active card data, along with other details, to
>    Hive.
>    - Another job loads that data into the Postgres DB, which ends up
>    holding 3 years of data for 4 regions and multiple countries.
>    - A Tableau dashboard has a live connection to the Postgres DB.
>    - The Tableau query calls the Postgres DB to aggregate the
>    binary/pre-aggregated data into a distinct card count (using the
>    DataSketches algorithm) and fetches data based on multiple filter
>    conditions.
>    - Usually a query covers a 2-month span in each of the 3 years, i.e. 6
>    months of data in total to aggregate for a country on multiple
>    conditions.
>
>
>
> Usually the response to this aggregation query is quite slow. We have
> tried a lot of different ways to resolve this.
>
>
>
> The datasketches part takes up most of the execution time.
>
>
>
> Thanks & Regards,
>
> Rima Bhowmick
>
> Marketing Brand Analytics
>
>
>

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Hello All,
Please help us fix an issue we are facing while migrating from version 1.3 to 1.6.

Error:
mbi1d01=# create extension datasketches;
ERROR:  extension "datasketches" already exists
mbi1d01=# drop extension datasketches;
ERROR:  cannot drop extension datasketches because other objects depend on it
DETAIL:  column actv_crd of table mbi.tmbaf_skyline_agg_bkp depends on type mbi.hll_sketch

Question:
We are already using v1.3. When we run CREATE EXTENSION, it fails because the extension already exists. We cannot drop the extension because a table column of type hll_sketch depends on it. Is it possible to upgrade from 1.3 to 1.6 directly without dropping the extension? If so, how?
We tried the command below to update the extension, but we get the following error:
mbi1d01=# alter extension datasketches update;
ERROR:  could not find function "pg_theta_intersection_get_result" in file "/pgbin/mbi1d/14.x/lib/postgresql/datasketches.so"


Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Wednesday, 24 May 2023 at 6:45 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I managed to reproduce the problem. It seems to me that such an old GCC 4.8.5 needs an older Boost. Try version 1.75.0. It works for me.

On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com> wrote:
What is your PostgreSQL version?

On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Here is the C++ version

[inline image: screenshot not preserved in this archive]

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 9:39 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

What is your compiler?
Try "c++ --version"

On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Just to add, here are the versions used:

  1.  datasketches-1.6.0 with datasketches-cpp included.
  2.  boost_1_80_0

Here is the folder structure:
[inline image: screenshot not preserved in this archive]

Error after “make install”:

src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

We are not able to install the latest version of datasketches.

Thanks,
Rima Bhowmick.

From: "Bhowmick, Rima" <rb...@visa.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 8:38 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hello All,

We are facing this error while installing datasketches 1.6 and running “make install”:
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Monday, 22 May 2023 at 10:12 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

This is not an error. This is a warning.

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using Boost 1.80.0 (boost_1_80_0). Please let us know if you have any idea what could be wrong.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 19 May 2023 at 1:54 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Yes, version 1.6.0 does depend on Boost. There is no need to install it. Just download it, unpack it, and make a link so that "make" can find it.
I am afraid I don't quite understand your question. I would suggest following the Readme and asking specific questions about what is not clear or what goes wrong.
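
The "download, unpack and make a link" step could look roughly like the sketch below. The function name and both directory arguments are hypothetical, chosen for illustration only. The link name "boost" is taken from the -Iboost flag visible in the compile command quoted further down in this thread; check the Readme for the layout the Makefile really expects.

```shell
#!/bin/sh
# Sketch only: make an unpacked Boost tree visible to the extension build.
# link_boost BOOST_DIR EXT_DIR
#   BOOST_DIR: unpacked Boost directory (hypothetical, e.g. ~/boost_1_75_0)
#   EXT_DIR:   extension source directory (hypothetical path)
link_boost() {
    boost_dir=$1
    ext_dir=$2
    # An unpacked Boost tree contains boost/version.hpp; fail early if not.
    if [ ! -f "$boost_dir/boost/version.hpp" ]; then
        echo "no boost/version.hpp under $boost_dir" >&2
        return 1
    fi
    # Create (or replace) the link EXT_DIR/boost -> BOOST_DIR, so that
    # -Iboost resolves <boost/...> includes to BOOST_DIR/boost/...
    ln -sfn "$boost_dir" "$ext_dir/boost"
}
```

After something like `link_boost "$HOME/boost_1_75_0" "$HOME/datasketches-1.6.0"` (both paths are placeholders), the documented `make` and `sudo make install` steps would be run from the extension directory.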

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Thanks for making datasketches version 1.6 live; it will help us a lot.
Today we downloaded the package from the PGXN website (https://pgxn.org/dist/datasketches/). Is it mandatory to install the Boost package too?
When we installed version 1.3 of the Postgres datasketches plugin earlier, we did not use Boost.

Also, are the steps below, as mentioned in the documentation, sufficient to install it?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install a new version of datasketches in our Postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip) and datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from the Apache website, as well as Boost 1.81.0. I followed the steps in the readme file. While executing the make command, I got an error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

Looks like there is a mismatch of template arguments between kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution? Thank you!

https://github.com/apache/datasketches-postgresql/issues/62
The DataSketches distinct-count Postgres extension delivers significant business value in our applications, so if we cannot upgrade, it would be a big loss for us.
Could you please advise on the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for releasing the parallel execution support?
Also, we tried upgrading datasketches following the latest documentation, but we are getting a lot of C++ version issues.
It's very tough to install the new version. Any thoughts?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also, we have just made some improvements to HLL merge speed in the core library. These changes have not been released yet, but are available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   A Hadoop data pipeline job pushes pre-aggregated (using the Hive datasketches algorithm) active card data, along with other details, to Hive.
  *   Another job loads that data into the Postgres DB, which ends up holding 3 years of data for 4 regions and multiple countries.
  *   A Tableau dashboard has a live connection to the Postgres DB.
  *   The Tableau query calls the Postgres DB to aggregate the binary/pre-aggregated data into a distinct card count (using the DataSketches algorithm) and fetches data based on multiple filter conditions.
  *   Usually a query covers a 2-month span in each of the 3 years, i.e. 6 months of data in total to aggregate for a country on multiple conditions.

Usually the response to this aggregation query is quite slow. We have tried a lot of different ways to resolve this.

The datasketches part takes up most of the execution time.

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
I managed to reproduce the problem. It seems to me that such an old GCC
4.8.5 needs an older Boost. Try version 1.75.0. It works for me.
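
To confirm which Boost release a given header directory contains, one option is to read BOOST_VERSION from boost/version.hpp, which encodes the release as major * 100000 + minor * 100 + patch (so 1.75.0 is 107500). The sketch below is an illustration only; the function name is made up here and the header path is supplied by the caller. Checking the path matters because the failing build in this thread reported headers under /usr/include/boost.

```shell
#!/bin/sh
# Sketch only: print the Boost release encoded in a boost/version.hpp file.
# boost_version HEADER   (HEADER is the path to a version.hpp file)
boost_version() {
    # Take the number from the line "#define BOOST_VERSION <number>".
    v=$(awk '$1 == "#define" && $2 == "BOOST_VERSION" { print $3; exit }' "$1")
    # No such line (or unreadable file): report failure.
    [ -n "$v" ] || return 1
    # BOOST_VERSION = major * 100000 + minor * 100 + patch
    echo "$((v / 100000)).$((v / 100 % 1000)).$((v % 100))"
}
```

For example, `boost_version boost/boost/version.hpp` run from the extension source directory (assuming the "boost" link described in the Readme) can be compared with the output of `c++ --version`.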

On Tue, May 23, 2023 at 2:47 PM Alexander Saydakov <sa...@yahooinc.com>
wrote:

> What is your PostgreSQL version?

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
What is your PostgreSQL version?

On Tue, May 23, 2023 at 9:35 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Here is the C++ version
>
>
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 9:39 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> What is your compiler?
>
> Try "c++ --version"
>
>
>
> On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Just to add on here the versions used.
>
>
>
>    1. datasketches-1.6.0 with datasketches-cpp included.
>    2. boost_1_80_0
>
>
>
> here is folder structure:
>
>
>
> Error after “make install”
>
>
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Not able to install the latest version of datasketches.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 8:38 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hello All,
>
>
>
> Facing this error while installing the dataSketches 1.6 version and doing
> “make install”.
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Please guide!
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Monday, 22 May 2023 at 10:12 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> This is not an error. This is a warning.
>
>
>
> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hi Alexander,
>
>
>
> We are trying to upgrade the datasketches-postgresql extension on one of
> our Linux servers.
>
>
>
> We are getting this error:
>
>
>
> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
> src/aod_sketch_c_adapter.cpp
> In file included from
> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>                  from boost/boost/math/special_functions/beta.hpp:13,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
> language standard to use Boost.Math will be C++14 starting in July 2023
> (Boost 1.82 release)" [-Wcpp]
> #    warning "The minimum language standard to use Boost.Math will be
> C++14 starting in July 2023 (Boost 1.82 release)"
>       ^
> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>                  from boost/boost/math/special_functions/gamma.hpp:18,
>                  from boost/boost/math/special_functions/beta.hpp:15,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
> boost::math::tools::integer_scalar_type<long double, true>’:
> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double; uintmax_t = long unsigned int]’
> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double]’
>
>
>
> We are using boost_1_80_0. Please let us know if you have any idea what
> could be wrong.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 19 May 2023 at 1:54 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
> Just download, unpack and make a link so that the "make" can find it.
>
> I am afraid I don't quite understand your question. I would suggest
> following the Readme and asking specific questions about what is not clear
> or what goes wrong.
>
>
>
> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>
> Today we downloaded the package from the *PGXN
> <https://pgxn.org/dist/datasketches/>* website. Is it mandatory to install
> the Boost package too?
>
> When we installed version 1.3 of the Postgres DataSketches plugin earlier,
> we did not use Boost.
>
>
>
> Also, are the steps below, as mentioned in the documentation, sufficient to
> install it?
>
> *Building and installing*
>
>    - make
>    - sudo make install
>
> Thanks in advance!
>
>
>
> Regards,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Thursday, 27 April 2023 at 1:25 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> The changes in question have been merged to the master branch.
>
> We have just started the release process for datasketches-cpp (version
> 4.1.0). Once this is done, we will start the release process for
> datasketches-postgresql 1.6.0. In the meantime you may want to try the
> latest code with the latest datasketches-cpp from the master branch.
>
>
>
> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>
> As noted in the linked issue, the postgresql 1.5 package is compatible
> with the cpp 3.x line, not 4.x. It should work fine with the last
> datasketches-cpp 3.x release.
>
>
>
> In the meantime, as noted, we are actively trying to work on speed
> improvements for HLL as requested at the start of this thread.
>
>
>
> Additionally, one thing that can help speed releases is to vote whenever
> there's a vote announcement -- even a non-binding vote is valuable!
>
>
>
>   jon
>
>
>
> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> We are trying to install new version of datasketches in our postgres
> instance. I have downloaded datasketches-postgresql 1.5.0
> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
> (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost
> 1.81.0. I have followed the same steps as mentioned in the readme file.
> While executing the make command, I faced an error:
>
> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
> -Idatasketches-cpp/req/include -I. -I./
> -I/pgbin/mbi1d/12.x/include/postgresql/server
> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template
> arguments (4, should be 3)
> typedef datasketches::kll_sketch<float, std::less<float>,
> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>
> ^
> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
> ‘template<class T, class C, class A> class datasketches::kll_sketch’
> class kll_sketch {
>
> Looks like there is a mismatch of arguments in
> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
> Could you please suggest a solution. Thank you!
>
> https://github.com/apache/datasketches-postgresql/issues/62
>
> *The DataSketches distinct-count Postgres extension is used in our
> applications to deliver significant business value; therefore, if we cannot
> upgrade the versions, it would be a big loss for us.*
>
> *Could you please advise on the best approach to overcome this?*
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Saturday, 15 April 2023 at 12:05 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I am not sure about the date. I think the development should take a few
> days. A formal Apache release will take substantially more time just to go
> through the required steps of voting for the core library release (not
> really necessary for the parallel execution, but necessary to bring the
> latest speed improvements into PostgreSQL extension), and then going
> through the same procedure to release the extension.
>
> Of course, you don't have to wait for the formal release to start testing.
>
> Could you clarify your issues building the latest version please? I
> believe that the datasketches-postgresql code in the master branch is
> compatible with the latest datasketches-cpp code.
>
>
>
> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Alexander,
>
>
>
> Do you have a date in mind for releasing the parallel execution support?
>
> Also, we tried upgrading the datasketches version following the latest
> documentation, and we are getting a lot of C++ version issues.
>
> It's very tough to install the new version. Any thoughts?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 14 April 2023 at 10:58 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hi Rima,
>
> I am working on the datasketches extension to support parallel queries
> (distributed aggregation).
>
> I expect to get this done in a matter of days.
>
> Also we have just made some improvements to HLL merge speed in the core
> library. These changes have not been released yet, but are available in the
> master branch.
>
> We have another HLL performance improvement in mind. I will work on it
> once I finish the parallel query support.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Team,
>
>
>
> Here is the snapshot of the existing application:
>
>
>
> TechStack: Postgres DB, Hive, Tableau UI
>
> Postgres Plugin: DataSketches
>
>
>
> Flow in brief:
>
>    - A Hadoop data pipeline job pushes pre-aggregated active card data
>    (using the Hive DataSketches algorithm), along with other details, to
>    Hive.
>    - Another job loads that data into the Postgres DB, which ends up
>    holding 3 years of data for 4 regions and multiple countries.
>    - A Tableau dashboard has a live connection to the Postgres DB.
>    - Tableau queries call the Postgres DB to aggregate the binary
>    pre-aggregated data into a distinct card count (using the DataSketches
>    algorithm) and to fetch data based on multiple filter conditions.
>    - Usually a query covers a 2-month span in each of 3 years, i.e. a
>    total of 6 months of data to aggregate for a country under multiple
>    conditions.
>
>
>
> Usually this aggregation query response is quite slow. We have tried a lot
> of different ways to resolve this.
>
>
>
> The datasketches part accounts for most of the execution time.
>
>
>
> Thanks & Regards,
>
> Rima Bhowmick
>
> Marketing Brand Analytics
>
>
>

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Here is the C++ version

[inline image (image001.png) not preserved in the plain-text archive]

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 9:39 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

What is your compiler?
Try "c++ --version"

On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Just to add on here the versions used.


  1.  datasketches-1.6.0 with datasketches-cpp included.
  2.  boost_1_80_0

here is folder structure:
[inline image (image002.png) not preserved in the plain-text archive]

Error after “make install”

src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Not able to install the latest version of datasketches.
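
The error above is reported from /usr/include/boost/..., whereas the warning log quoted below shows headers resolved under boost/boost/... One thing that may be worth checking (an assumption, not a confirmed diagnosis) is which Boost tree the compiler actually picks up with the include flags used by the build. A minimal sketch, with resolved_boost as an illustrative helper name:

```shell
#!/bin/sh
# Show which boost/version.hpp the compiler resolves for given -I flags.
resolved_boost() {
    # "$@": include flags as used by the build, e.g. -Iboost
    # -M lists every header the translation unit depends on, with its path.
    echo '#include <boost/version.hpp>' |
        c++ "$@" -x c++ -M - 2>/dev/null |
        tr ' \\' '\n\n' | grep 'boost/version\.hpp$' | head -n 1
}

# Usage from the extension source directory:
#   resolved_boost -Iboost
# A path starting with boost/ means the local link is used; a path under
# /usr/include means a system-wide Boost is being picked up instead.
```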

Thanks,
Rima Bhowmick.

From: "Bhowmick, Rima" <rb...@visa.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 8:38 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hello All,

Facing this error while installing the dataSketches 1.6 version and doing “make install”.
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Monday, 22 May 2023 at 10:12 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

This is not an error. This is a warning.
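
The distinction can be checked mechanically. A minimal sketch, assuming the output of make has been saved to a file (the file name build.log and the helper build_errors are illustrative, not part of the extension):

```shell
#!/bin/sh
# Separate real compiler errors from warnings and template notes in a build log.
build_errors() {
    # $1: file holding the captured output of "make"
    # GCC labels diagnostics "error:" or "warning:"; lines such as
    # "In instantiation of" and "required from" are context notes only.
    grep -E '(^|[[:space:]])error:' "$1"
}

# Usage:
#   make 2>&1 | tee build.log
#   build_errors build.log || echo "no errors found"
```

In the log quoted below, only "warning:" and "required from" lines appear, so this filter would print nothing for that excerpt.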

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using boost_1_80_0. Please let us know if you have any idea what could be wrong.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 19 May 2023 at 1:54 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Yes, version 1.6.0 does depend on Boost. There is no need to install it. Just download, unpack and make a link so that the "make" can find it.
I am afraid I don't quite understand your question. I would suggest following the Readme and asking specific questions about what is not clear or what goes wrong.
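
The download, unpack and link step can be sketched as follows. This assumes the build looks for headers through -Iboost, as the compile line quoted later in this thread does, so a link named boost in the extension source directory is enough. The helper name link_boost and the archive name are illustrative.

```shell
#!/bin/sh
# Link an unpacked Boost tree as ./boost so that -Iboost finds headers
# as boost/boost/... without a system-wide install.
link_boost() {
    # $1: path to the unpacked Boost directory, e.g. boost_1_80_0
    src=$1
    if [ ! -f "$src/boost/version.hpp" ]; then
        echo "no Boost headers under $src" >&2
        return 1
    fi
    ln -sfn "$src" boost
    echo "boost -> $src"
}

# Typical use from the extension source directory (archive name assumed):
#   tar xf boost_1_80_0.tar.gz
#   link_boost boost_1_80_0
#   make
#   sudo make install
```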

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Thanks for making the dataSketches1.6 version live, it will help us a lot.
Today we downloaded the package from the PGXN <https://pgxn.org/dist/datasketches/> website. Is it mandatory to install the Boost package too?
When we installed version 1.3 of the Postgres DataSketches plugin earlier, we did not use Boost.

Also, are the steps below, as mentioned in the documentation, sufficient to install it?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.
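
The pairings mentioned in this thread can be written down as a small lookup. This is only a sketch: the 1.5 entry is stated above, the 1.6 entry is inferred from the release order described elsewhere in the thread (datasketches-cpp 4.1.0 first, then the 1.6.0 extension), and the helper name compatible_cpp is hypothetical.

```shell
#!/bin/sh
# Map a datasketches-postgresql version to the datasketches-cpp line it is
# paired with in this thread. Anything not mentioned is reported as unknown
# rather than guessed.
compatible_cpp() {
    case "$1" in
        1.5|1.5.*) echo "3.x" ;;      # stated: 1.5 works with cpp 3.x, not 4.x
        1.6|1.6.*) echo "4.1.0" ;;    # inferred from the release order in the thread
        *)         echo "unknown" ;;
    esac
}

compatible_cpp 1.5.0
compatible_cpp 1.6.0
```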

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install new version of datasketches in our postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from apache website and boost 1.81.0. I have followed the same steps as mentioned in the readme file. While executing the make command, I faced an error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

Looks like there is a mismatch of arguments in kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution. Thank you!

https://github.com/apache/datasketches-postgresql/issues/62
The DataSketches distinct-count Postgres extension is used in our applications to deliver significant business value; therefore, if we cannot upgrade the versions, it would be a big loss for us.
Could you please advise on the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for releasing the parallel execution support?
Also, we tried upgrading the datasketches version following the latest documentation, and we are getting a lot of C++ version issues.
It's very tough to install the new version. Any thoughts?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also we have just made some improvements to HLL merge speed in the core library. These changes have not been released yet, but are available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   A Hadoop data pipeline job pushes pre-aggregated active card data (using the Hive DataSketches algorithm), along with other details, to Hive.
  *   Another job loads that data into the Postgres DB, which ends up holding 3 years of data for 4 regions and multiple countries.
  *   A Tableau dashboard has a live connection to the Postgres DB.
  *   Tableau queries call the Postgres DB to aggregate the binary pre-aggregated data into a distinct card count (using the DataSketches algorithm) and to fetch data based on multiple filter conditions.
  *   Usually a query covers a 2-month span in each of 3 years, i.e. a total of 6 months of data to aggregate for a country under multiple conditions.

Usually this aggregation query response is quite slow. We have tried a lot of different ways to resolve this.

The datasketches part accounts for most of the execution time.

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
What is your compiler?
Try "c++ --version"
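
A slightly fuller check is sketched below. It assumes that c++ resolves to GCC or Clang, both of which accept -dM -E to dump predefined macros; the helper name cxx_standard is illustrative. The standard matters here because the compile line quoted in this thread uses -std=c++11, while the Boost.Math warning announces a C++14 minimum starting with Boost 1.82.

```shell
#!/bin/sh
# Report the compiler that "c++" resolves to and the C++ standard it
# uses for a given -std flag (no flag = compiler default).
cxx_standard() {
    # $1: optional flag such as -std=c++11
    # __cplusplus encodes the standard:
    # 201103L = C++11, 201402L = C++14, 201703L = C++17.
    c++ $1 -x c++ -dM -E - </dev/null 2>/dev/null |
        awk '$2 == "__cplusplus" { print $3 }'
}

if command -v c++ >/dev/null 2>&1; then
    c++ --version | head -n 1
    echo "default standard: $(cxx_standard)"
    echo "with -std=c++11:  $(cxx_standard -std=c++11)"
else
    echo "c++ not found on PATH"
fi
```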

On Mon, May 22, 2023 at 8:21 PM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Just to add on here the versions used.
>
>
>
>    1. datasketches-1.6.0 with datasketches-cpp included.
>    2. boost_1_80_0
>
>
>
> here is folder structure:
>
>
>
> Error after “make install”
>
>
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Not able to install the latest version of datasketches.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *"Bhowmick, Rima" <rb...@visa.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Tuesday, 23 May 2023 at 8:38 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hello All,
>
>
>
> Facing this error while installing the dataSketches 1.6 version and doing
> “make install”.
>
> src/aod_sketch_c_adapter.cpp:310:56:   required from here
> /usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is
> not a class, struct, or union type
>        using value_type = typename T::value_type;
>                                                 ^
> make: *** [src/aod_sketch_c_adapter.o] Error 1
>
>
>
> Please guide!
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Monday, 22 May 2023 at 10:12 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> This is not an error. This is a warning.
>
>
>
> On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hi Alexander,
>
>
>
> We are trying to upgrade the datasketches-postgresql extension on one of
> our Linux servers.
>
>
>
> We are getting this error:
>
>
>
> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
> src/aod_sketch_c_adapter.cpp
> In file included from
> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>                  from boost/boost/math/special_functions/beta.hpp:13,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
> language standard to use Boost.Math will be C++14 starting in July 2023
> (Boost 1.82 release)" [-Wcpp]
> #    warning "The minimum language standard to use Boost.Math will be
> C++14 starting in July 2023 (Boost 1.82 release)"
>       ^
> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>                  from boost/boost/math/special_functions/gamma.hpp:18,
>                  from boost/boost/math/special_functions/beta.hpp:15,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
> boost::math::tools::integer_scalar_type<long double, true>’:
> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double; uintmax_t = long unsigned int]’
> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double]’
>
>
>
> We are using boost_1_80_0. Please let us know if you have any idea what
> could be wrong.
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 19 May 2023 at 1:54 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Yes, version 1.6.0 does depend on Boost. There is no need to install it.
> Just download, unpack and make a link so that the "make" can find it.
>
> I am afraid I don't quite understand your question. I would suggest
> following the Readme and asking specific questions about what is not clear
> or what goes wrong.
>
>
>
> On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Thanks for making the dataSketches1.6 version live, it will help us a lot.
>
> Today we downloaded the package from the *PGXN
> <https://pgxn.org/dist/datasketches/>* website. Is it mandatory to install
> the Boost package too?
>
> When we installed version 1.3 of the Postgres DataSketches plugin earlier,
> we did not use Boost.
>
>
>
> Also, are the steps below, as mentioned in the documentation, sufficient to
> install it?
>
> *Building and installing*
>
>    - make
>    - sudo make install
>
> Thanks in advance!
>
>
>
> Regards,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Thursday, 27 April 2023 at 1:25 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> The changes in question have been merged to the master branch.
>
> We have just started the release process for datasketches-cpp (version
> 4.1.0). Once this is done, we will start the release process for
> datasketches-postgresql 1.6.0. In the meantime you may want to try the
> latest code with the latest datasketches-cpp from the master branch.
>
>
>
> On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
>
> As noted in the linked issue, the postgresql 1.5 package is compatible
> with the cpp 3.x line, not 4.x. It should work fine with the last
> datasketches-cpp 3.x release.
>
>
>
> In the meantime, as noted, we are actively trying to work on speed
> improvements for HLL as requested at the start of this thread.
>
>
>
> Additionally, one thing that can help speed releases is to vote whenever
> there's a vote announcement -- even a non-binding vote is valuable!
>
>
>
>   jon
>
>
>
> On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello All,
>
> We are trying to install a new version of datasketches in our Postgres
> instance. I have downloaded datasketches-postgresql 1.5.0
> (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1
> (apache-datasketches-cpp-4.0.1-src.zip) from the Apache website and boost
> 1.81.0. I have followed the steps mentioned in the readme file.
> While executing the make command, I got an error:
>
> g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
> -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC
> -I/usr/local/include -Iboost -Idatasketches-cpp/common/include
> -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include
> -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include
> -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include
> -Idatasketches-cpp/req/include -I. -I./
> -I/pgbin/mbi1d/12.x/include/postgresql/server
> -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE
> -I/pgbin/mbi1d/12.x//include/libxml2   -c -o
> src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
> src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template
> arguments (4, should be 3)
> typedef datasketches::kll_sketch<float, std::less<float>,
> datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
>
> ^
> In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
> datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for
> ‘template<class T, class C, class A> class datasketches::kll_sketch’
> class kll_sketch {
>
> Looks like there is a mismatch of arguments in
> kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
> Could you please suggest a solution. Thank you!
>
> https://github.com/apache/datasketches-postgresql/issues/62
>
> *The Datasketches distinct count Postgres extension is used in our
> applications to deliver significant business value; therefore, if we cannot
> upgrade the versions, it would be a big loss for us.*
>
> *Could you please guide us on the best approach to overcome this?*
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply to: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Saturday, 15 April 2023 at 12:05 AM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> I am not sure about the date. I think the development should take a few
> days. A formal Apache release will take substantially more time just to go
> through the required steps of voting for the core library release (not
> really necessary for the parallel execution, but necessary to bring the
> latest speed improvements into PostgreSQL extension), and then going
> through the same procedure to release the extension.
>
> Of course, you don't have to wait for the formal release to start testing.
>
> Could you clarify your issues building the latest version please? I
> believe that the datasketches-postgresql code in the master branch is
> compatible with the latest datasketches-cpp code.
>
>
>
> On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Alexander,
>
>
>
> Do you have a date in mind for releasing the version with parallel
> execution?
>
> Also, we tried upgrading the datasketches version following the latest
> documentation, and we are getting a lot of C++ version issues.
>
> It's very tough to install the new version. Any thoughts?
>
>
>
> Thanks,
>
> Rima Bhowmick.
>
>
>
> *From: *Alexander Saydakov <sa...@yahooinc.com.INVALID>
> *Reply-To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Date: *Friday, 14 April 2023 at 10:58 PM
> *To: *"dev@datasketches.apache.org" <de...@datasketches.apache.org>
> *Subject: *Re: [E] Postgres HLL is very slow
>
>
>
> Hi Rima,
>
> I am working on the datasketches extension to support parallel queries
> (distributed aggregation).
>
> I expect to get this done in a matter of days.
>
> Also we have just made some improvements to HLL merge speed in the core
> library. These changes were not released yet, but available in the master
> branch.
>
> We have another HLL performance improvement in mind. I will work on it
> once I finish the parallel query support.
>
>
>
>
>
> On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid>
> wrote:
>
> Hello Team,
>
>
>
> Here is the snapshot of the existing application:
>
>
>
> TechStack: Postgres DB, Hive, Tableau UI
>
> Postgres Plugin: DataSketches
>
>
>
> Flow in brief:
>
>    - Hadoop Data pipeline job pushes pre-aggregated(using hive
>    datasketches algo) active card data, along with other details to Hive.
>    - Another job populates that data to Postgres DB, finally having 3
>    years data of 4 regions for multiple countries.
>    - Tableau dashboard having live connection to Postgres DB.
>    - Tableau Query calling Postgres DB, to aggregate the
>    binary/pre-aggregated data to get distinct card count (using DataSketches
>    algorithm) and fetch data based on multiple filter conditions.
>    - Usually data would be of 3yrs for the span of 2 months, means total
>    6 months of data to aggregate for a country on multiple conditions.
>
>
>
> Usually this aggregation query response is quite slow. We have tried a lot
> of different ways to resolve this.
>
>
>
> The datasketches part takes up most of the execution time.
>
>
>
> Thanks & Regards,
>
> Rima Bhowmick
>
> Marketing Brand Analytics
>
>
>

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Just to add the versions used here:


  1.  datasketches-1.6.0 with datasketches-cpp included.
  2.  boost_1_80_0

Here is the folder structure:
[image: folder structure]

Error after “make install”

src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

We are not able to install the latest version of datasketches.
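
The failing header above is /usr/include/boost/math/tools/fraction.hpp, whereas the warning log quoted in this thread shows headers under boost/boost/math/, so the two builds may be picking up different copies of Boost. A minimal sketch for listing which Boost header locations a saved build log mentions follows. The function name and the log file are hypothetical, and the script only inspects text; it does not diagnose the error itself.

```shell
# boost_roots LOG
# Prints each distinct path prefix that precedes "boost/math/" in a saved
# compiler log, one per line. More than one prefix means that more than one
# Boost header tree shows up in the output.
boost_roots() {
    grep -oE '[^ ]*boost/math/' "$1" \
        | sed 's|boost/math/$||' \
        | LC_ALL=C sort -u
}
```

For example, after `make 2>&1 | tee build.log`, running `boost_roots build.log` on output like the two logs in this thread would list both `/usr/include/` and `boost/`.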

Thanks,
Rima Bhowmick.

From: "Bhowmick, Rima" <rb...@visa.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Tuesday, 23 May 2023 at 8:38 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hello All,

We are facing this error while installing dataSketches version 1.6 and running “make install”:
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Monday, 22 May 2023 at 10:12 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

This is not an error. This is a warning.

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using Boost version boost_1_80_0. Please let us know if you have any idea what could be wrong.

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 19 May 2023 at 1:54 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Yes, version 1.6.0 does depend on Boost. There is no need to install it. Just download, unpack and make a link so that the "make" can find it.
I am afraid I don't quite understand your question. I would suggest following the Readme and asking specific questions about what is not clear or what goes wrong.
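
A minimal sketch of the "unpack and make a link" step described above. It assumes Boost has already been downloaded and unpacked (for example into a directory named boost_1_80_0) and that the build looks for a directory or link named `boost` in the extension source root; the `-Iboost` flag and the `boost/boost/math/...` paths in the compiler output quoted in this thread suggest this, but the Readme remains the authority. The function name and paths are illustrative only.

```shell
# link_boost SRC_DIR BOOST_DIR
# Creates (or refreshes) a symlink named "boost" inside SRC_DIR that points
# at an already unpacked Boost tree. Nothing is compiled or installed.
link_boost() {
    src_dir=$1
    boost_dir=$2
    # Boost headers live under <tree>/boost/, e.g. boost/version.hpp.
    if [ ! -d "$boost_dir/boost" ]; then
        echo "no boost/ header directory under $boost_dir" >&2
        return 1
    fi
    # Resolve to an absolute path so the link works from any directory.
    boost_abs=$(cd "$boost_dir" && pwd)
    # -s symbolic, -f replace an existing link, -n do not follow it.
    ln -sfn "$boost_abs" "$src_dir/boost"
}
```

From the extension source directory this could be used as `link_boost . ../boost_1_80_0`, followed by the documented `make` and `sudo make install`.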

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Thanks for making dataSketches version 1.6 live; it will help us a lot.
Today we downloaded the package from the PGXN <https://pgxn.org/dist/datasketches/> website. Is it mandatory to install the Boost package too?
When we installed version 1.3 of the Postgres dataSketches plugin earlier, we did not use Boost.

Also, are the steps below, as mentioned in the documentation, sufficient to install it?
Building and installing

  *   make
  *   sudo make install
Thanks in advance!

Regards,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Thursday, 27 April 2023 at 1:25 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

The changes in question have been merged to the master branch.
We have just started the release process for datasketches-cpp (version 4.1.0). Once this is done, we will start the release process for datasketches-postgresql 1.6.0. In the meantime you may want to try the latest code with the latest datasketches-cpp from the master branch.

On Wed, Apr 19, 2023 at 12:58 AM Jon Malkin <jm...@apache.org> wrote:
As noted in the linked issue, the postgresql 1.5 package is compatible with the cpp 3.x line, not 4.x. It should work fine with the last datasketches-cpp 3.x release.
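
The compatibility note above can be turned into a small pre-build check. This is only a sketch: the 1.5 to 3.x pairing is stated in this thread, while the 1.6 to 4.x pairing is an assumption inferred from the release order described above, and both function names are hypothetical.

```shell
# expected_cpp_major EXT_VERSION
# Prints the datasketches-cpp major version expected for a given
# datasketches-postgresql version, or "unknown".
expected_cpp_major() {
    case $1 in
        1.5|1.5.*) echo 3 ;;   # stated in this thread: 1.5 goes with cpp 3.x, not 4.x
        1.6|1.6.*) echo 4 ;;   # assumption inferred from the release order
        *) echo unknown ;;
    esac
}

# cpp_matches EXT_VERSION CPP_VERSION
# Succeeds (exit status 0) when the major version of CPP_VERSION is the
# expected one for EXT_VERSION.
cpp_matches() {
    [ "$(expected_cpp_major "$1")" = "${2%%.*}" ]
}
```

With these definitions, `cpp_matches 1.5.0 4.0.1` fails, which corresponds to the combination that produced the template argument error reported in this thread.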

In the meantime, as noted, we are actively trying to work on speed improvements for HLL as requested at the start of this thread.

Additionally, one thing that can help speed releases is to vote whenever there's a vote announcement -- even a non-binding vote is valuable!

  jon

On Wed, Apr 19, 2023, 12:13 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:

Hello All,

We are trying to install a new version of datasketches in our Postgres instance. I have downloaded datasketches-postgresql 1.5.0 (apache-datasketches-postgresql-1.5.0-src.zip), datasketches-cpp 4.0.1 (apache-datasketches-cpp-4.0.1-src.zip) from the Apache website and boost 1.81.0. I have followed the steps mentioned in the readme file. While executing the make command, I got an error:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -std=c++11 -fPIC -fPIC -I/usr/local/include -Iboost -Idatasketches-cpp/common/include -Idatasketches-cpp/kll/include -Idatasketches-cpp/cpc/include -Idatasketches-cpp/theta/include -Idatasketches-cpp/fi/include -Idatasketches-cpp/hll/include -Idatasketches-cpp/tuple/include -Idatasketches-cpp/req/include -I. -I./ -I/pgbin/mbi1d/12.x/include/postgresql/server -I/pgbin/mbi1d/12.x/include/postgresql/internal  -D_GNU_SOURCE -I/pgbin/mbi1d/12.x//include/libxml2   -c -o src/kll_float_sketch_c_adapter.o src/kll_float_sketch_c_adapter.cpp
src/kll_float_sketch_c_adapter.cpp:26:109: error: wrong number of template arguments (4, should be 3)
typedef datasketches::kll_sketch<float, std::less<float>, datasketches::serde<float>, palloc_allocator<float>> kll_float_sketch;
                                                                                                             ^
In file included from src/kll_float_sketch_c_adapter.cpp:24:0:
datasketches-cpp/kll/include/kll_sketch.hpp:158:7: error: provided for ‘template<class T, class C, class A> class datasketches::kll_sketch’
class kll_sketch {

Looks like there is a mismatch of arguments in kll_float_sketch_c_adapter.cpp and kll_sketch.hpp.
Could you please suggest a solution. Thank you!

https://github.com/apache/datasketches-postgresql/issues/62
The Datasketches distinct count Postgres extension is used in our applications to deliver significant business value; therefore, if we cannot upgrade the versions, it would be a big loss for us.
Could you please guide us on the best approach to overcome this?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply to: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Saturday, 15 April 2023 at 12:05 AM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

I am not sure about the date. I think the development should take a few days. A formal Apache release will take substantially more time just to go through the required steps of voting for the core library release (not really necessary for the parallel execution, but necessary to bring the latest speed improvements into PostgreSQL extension), and then going through the same procedure to release the extension.
Of course, you don't have to wait for the formal release to start testing.
Could you clarify your issues building the latest version please? I believe that the datasketches-postgresql code in the master branch is compatible with the latest datasketches-cpp code.

On Fri, Apr 14, 2023 at 11:22 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Alexander,

Do you have a date in mind for releasing the version with parallel execution?
Also, we tried upgrading the datasketches version following the latest documentation, and we are getting a lot of C++ version issues.
It's very tough to install the new version. Any thoughts?

Thanks,
Rima Bhowmick.

From: Alexander Saydakov <sa...@yahooinc.com.INVALID>
Reply-To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Date: Friday, 14 April 2023 at 10:58 PM
To: "dev@datasketches.apache.org" <de...@datasketches.apache.org>
Subject: Re: [E] Postgres HLL is very slow

Hi Rima,
I am working on the datasketches extension to support parallel queries (distributed aggregation).
I expect to get this done in a matter of days.
Also we have just made some improvements to HLL merge speed in the core library. These changes were not released yet, but available in the master branch.
We have another HLL performance improvement in mind. I will work on it once I finish the parallel query support.


On Fri, Apr 14, 2023 at 3:33 AM Bhowmick, Rima <rb...@visa.com.invalid> wrote:
Hello Team,

Here is the snapshot of the existing application:

TechStack: Postgres DB, Hive, Tableau UI
Postgres Plugin: DataSketches

Flow in brief:

  *   Hadoop Data pipeline job pushes pre-aggregated(using hive datasketches algo) active card data, along with other details to Hive.
  *   Another job populates that data to Postgres DB, finally having 3 years data of 4 regions for multiple countries.
  *   Tableau dashboard having live connection to Postgres DB.
  *   Tableau Query calling Postgres DB, to aggregate the binary/pre-aggregated data to get distinct card count (using DataSketches algorithm) and fetch data based on multiple filter conditions.
  *   Usually data would be of 3yrs for the span of 2 months, means total 6 months of data to aggregate for a country on multiple conditions.

Usually this aggregation query response is quite slow. We have tried a lot of different ways to resolve this.

The datasketches part takes up most of the execution time.

Thanks & Regards,
Rima Bhowmick
Marketing Brand Analytics

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Hello All,

We are facing this error while installing dataSketches version 1.6 and running “make install”:
src/aod_sketch_c_adapter.cpp:310:56:   required from here
/usr/include/boost/math/tools/fraction.hpp:84:48: error: ‘long double’ is not a class, struct, or union type
       using value_type = typename T::value_type;
                                                ^
make: *** [src/aod_sketch_c_adapter.o] Error 1

Please guide!

Thanks,
Rima Bhowmick.


Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
This is not an error. This is a warning.
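
To tell the two apart in a long compiler log, one option is to count the diagnostics by kind. The sketch below assumes the output of make was saved to a file (build.log is a hypothetical name) and relies on the `error:` and `warning:` prefixes that GCC prints in the output quoted in this thread.

```shell
# count_diagnostics LOG
# Prints how many lines of a saved build log contain a GCC "error:" or
# "warning:" diagnostic. Only errors stop the build.
count_diagnostics() {
    log=$1
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    errors=$(grep -c ' error: ' "$log" || true)
    warnings=$(grep -c ' warning: ' "$log" || true)
    echo "errors=$errors warnings=$warnings"
}
```

A possible use is `make 2>&1 | tee build.log` followed by `count_diagnostics build.log`; a result with zero errors means the compiler output contained warnings only.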

On Mon, May 22, 2023 at 2:55 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Hi Alexander,
>
>
>
> We are trying to upgrade the datasketches-postgresql extension on one of
> our Linux servers.
>
>
>
> We are getting this error:
>
>
>
> sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o
> src/aod_sketch_c_adapter.cpp
> In file included from
> boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
>                  from boost/boost/math/special_functions/math_fwd.hpp:29,
>                  from boost/boost/math/special_functions/beta.hpp:13,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum
> language standard to use Boost.Math will be C++14 starting in July 2023
> (Boost 1.82 release)" [-Wcpp]
> #    warning "The minimum language standard to use Boost.Math will be
> C++14 starting in July 2023 (Boost 1.82 release)"
>       ^
> In file included from boost/boost/math/tools/fraction.hpp:14:0,
>                  from boost/boost/math/special_functions/gamma.hpp:18,
>                  from boost/boost/math/special_functions/beta.hpp:15,
>                  from boost/boost/math/distributions/students_t.hpp:16,
>                  from src/aod_sketch_c_adapter.cpp:34:
> boost/boost/math/tools/complex.hpp: In instantiation of ‘struct
> boost::math::tools::integer_scalar_type<long double, true>’:
> boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with
> Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double; uintmax_t = long unsigned int]’
> boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename
> boost::math::tools::detail::fraction_traits<Gen>::result_type
> boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen =
> boost::math::detail::ibeta_fraction2_t<long double>; U = long double;
> typename boost::math::tools::detail::fraction_traits<Gen>::result_type =
> long double]’
>
>
>
> We are using Boost 1.80.0 (boost_1_80_0). Please let us know if you have
> any clue what could be wrong.
>
>
>
> Thanks,
>
> Rima Bhowmick.

Re: [E] Postgres HLL is very slow

Posted by "Bhowmick, Rima" <rb...@visa.com.INVALID>.
Hi Alexander,

We are trying to upgrade the datasketches-postgresql extension on one of our Linux servers.

We are getting this error:

sr/include/libxml2   -c -o src/aod_sketch_c_adapter.o src/aod_sketch_c_adapter.cpp
In file included from boost/boost/math/special_functions/detail/round_fwd.hpp:11:0,
                 from boost/boost/math/special_functions/math_fwd.hpp:29,
                 from boost/boost/math/special_functions/beta.hpp:13,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/config.hpp:23:6: warning: #warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)" [-Wcpp]
#    warning "The minimum language standard to use Boost.Math will be C++14 starting in July 2023 (Boost 1.82 release)"
      ^
In file included from boost/boost/math/tools/fraction.hpp:14:0,
                 from boost/boost/math/special_functions/gamma.hpp:18,
                 from boost/boost/math/special_functions/beta.hpp:15,
                 from boost/boost/math/distributions/students_t.hpp:16,
                 from src/aod_sketch_c_adapter.cpp:34:
boost/boost/math/tools/complex.hpp: In instantiation of ‘struct boost::math::tools::integer_scalar_type<long double, true>’:
boost/boost/math/tools/fraction.hpp:115:72:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&, uintmax_t&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double; uintmax_t = long unsigned int]’
boost/boost/math/tools/fraction.hpp:156:52:   required from ‘typename boost::math::tools::detail::fraction_traits<Gen>::result_type boost::math::tools::continued_fraction_b(Gen&, const U&) [with Gen = boost::math::detail::ibeta_fraction2_t<long double>; U = long double; typename boost::math::tools::detail::fraction_traits<Gen>::result_type = long double]’

We are using Boost 1.80.0 (boost_1_80_0). Please let us know if you have any clue what could be wrong.

Thanks,
Rima Bhowmick.


Re: [E] Postgres HLL is very slow

Posted by Alexander Saydakov <sa...@yahooinc.com.INVALID>.
Yes, version 1.6.0 does depend on Boost. There is no need to install it.
Just download it, unpack it, and make a link so that "make" can find it.
I am afraid I don't quite understand your question. I would suggest
following the Readme and asking specific questions about what is not clear
or what goes wrong.
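
A minimal sketch of the "unpack and make a link" step, not part of the original reply. The link name `boost` is inferred from the `-Iboost` flag and the `boost/boost/math/...` include paths in the build output posted elsewhere in this thread; the Readme remains the authoritative source. All directory names are hypothetical, and a scratch directory stands in for the real download.

```shell
#!/bin/sh
# Hypothetical layout inside a scratch directory.
work=$(mktemp -d)
boost_dir="$work/boost_1_80_0"            # unpacked Boost release (assumed name)
ext_dir="$work/datasketches-postgresql"   # extension source tree (assumed name)

# Stand-in for "download and unpack": create just enough of the header tree.
mkdir -p "$boost_dir/boost/math" "$ext_dir"

# The step itself: a link named "boost" inside the extension source tree,
# so that -Iboost resolves includes such as <boost/math/...>.
ln -s "$boost_dir" "$ext_dir/boost"

# Check that the headers are reachable through the link.
if [ -d "$ext_dir/boost/boost/math" ]; then
  linked=yes
else
  linked=no
fi
echo "boost headers reachable: $linked"
rm -rf "$work"
```

With a real download, only the `ln -s` line is needed, run from the extension source directory and pointing at wherever Boost was unpacked. Nothing has to be compiled or installed system-wide, since only the headers are used.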

On Wed, May 17, 2023 at 6:32 AM Bhowmick, Rima <rb...@visa.com.invalid>
wrote:

> Thanks for making the DataSketches 1.6 version live; it will help us a lot.
>
> Today we downloaded the package from the PGXN
> <https://pgxn.org/dist/datasketches/>
> website. Is it mandatory to install the Boost package too?
>
> When we installed version 1.3 of the Postgres DataSketches plugin earlier,
> we didn't use Boost.
>
>
>
> Also, are the steps below, as mentioned in the documentation, sufficient
> to install it?
>
> *Building and installing*
>
>    - make
>    - sudo make install
>
> Thanks in advance!
>
>
>
> Regards,
>
> Rima Bhowmick.